Abstract

We derive an asymptotic formula for the number of strongly connected digraphs with
$n$ vertices and $m$ arcs (directed edges), valid for $m - n \to \infty$ as $n \to \infty$ provided $m =
O(n \log n)$. This fills the gap between Wright's results, which apply to $m = n + O(1)$, and
the long-known threshold for $m$, above which a random digraph with $n$ vertices and $m$
arcs is likely to be strongly connected.
1 Introduction

One of the most fundamental properties of a directed graph (digraph), and possibly the most 
useful for communication networks, is that of being strongly connected, that is, possessing 
directed paths both ways between every pair of vertices. It was long ago shown by Moon and 
Moser [8] that almost all of the 2 n digraphs with n vertices are strongly connected due to 
having paths of length 2 between each pair of vertices. So, from the asymptotic enumeration 
perspective, a more interesting problem is the enumeration of strongly connected digraphs 
on n vertices with m arcs (i.e. directed edges). In this paper, all digraphs are labelled. Our 
results cover only simple digraphs (i.e. digraphs with no multiple arcs), but unless otherwise 
stated we allow digraphs to have loops. We also give results for digraphs in which loops are 
forbidden, which we refer to as loop-free digraphs. 

Palásti [9] determined the threshold of strong connectivity, as follows. Let $a$ be fixed
and define $m(a,n) = \lfloor n \log n + an \rfloor$. Then, for a random directed graph having $n$ vertices
and m arcs, so that each of the (^) possible choices is equiprobable, the probability that the 
digraph is strongly connected tends to $\exp(-2e^{-a})$ as $n \to \infty$. Multiplying this probability by
C^-) consequently gives an asymptotic formula for the number S(n,m) of strongly connected 
digraphs with n vertices and m arcs, for such m. This also easily implies that S(n, m) ~ [%) 
if $m = m(a_n, n)$ with $a_n \to \infty$. On the other hand, Wright [13] obtained recurrences for the
exact value of $S(n,m)$ when $m = n + O(1)$. (We must require $m \ge n$ to avoid the failure to
be strongly connected for trivial reasons.) In this paper, we fill the entire gap between these
results, deriving an asymptotic formula for $S(n,m)$, valid for $m - n \to \infty$ as $n \to \infty$ provided
$m = O(n \log n)$. Our main result is as follows.

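Theorem 1.1. Uniformly for $m = O(n \log n)$ and $m - n \to \infty$, the number of strongly
connected digraphs with $n$ vertices and $m$ arcs is asymptotic to

$$\frac{(m-1)!\,(e^{\lambda}-1)^{2n}}{2\pi(1+\lambda-c)\,\lambda^{2m}}\,\exp(-\lambda^2/2)\,\frac{(e^{\lambda}-1-\lambda)^2\,e^{\lambda}}{(e^{2\lambda}-e^{\lambda}-\lambda)(e^{\lambda}-1)}, \qquad (1)$$

where $c = m/n > 1$ and $\lambda$ is determined by the equation $c = \lambda e^{\lambda}/(e^{\lambda}-1)$.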

Note. In particular, if $c \to 1$ then the expression (1) simplifies asymptotically to

$$\frac{(m-1)!\,(e^{\lambda}-1)^{2n}}{6\pi\,\lambda^{2m}}, \qquad (2)$$

whilst if $c \to \infty$ then (1) is asymptotic to

$$\frac{(m-1)!\,(e^{\lambda}-1)^{2n}}{2\pi\,\lambda^{2m}}\,\exp(-\lambda^2/2). \qquad (3)$$



Our result has counterparts for undirected graphs. An asymptotic formula for the number
of connected graphs with $n$ vertices and $m$ edges was given for all $m$ such that $m - n \to \infty$ as
$n \to \infty$ by Bender, Canfield and McKay [2]. This improved the range of $m$ for which earlier
estimates were found, and also the bounds on the error term. A simpler approach to the same
problem was given in [12]. This begins by counting connected graphs with no end-vertices,
and then considers the number of ways to attach a forest. One of the ways used there to count
connected 2-cores was to count connected kernels, which have no vertices of degree 2, and
insert vertices of degree 2 into their edges, and another way was based on eliminating isolated
cycles by inversion. In the present paper, for the case $m = O(n)$ we use the first of these
two alternatives. This has some advantage in providing direct information on properties of
the kernel, such as was used in [5] for studying long cycles in the supercritical random graph.
In a similar way, we can study the analogous structure for a digraph, which we call its heart.
For $m/n \to \infty$ we use a rather different approach to show that random digraphs with all in-
and out-degrees at least 1 are strongly connected with high probability.

Our argument requires a formula for the number of digraphs with all in- and outdegrees
at least 1 and a given number of arcs, which we obtain using the method for counting graphs
with given minimum degree developed by Pittel and the second author in [11].

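Theorem 1.2. Uniformly for $m = O(n \log n)$,

$$|\mathcal{G}_{1,1}(n,m)| \sim \frac{m!\,(e^{\lambda}-1)^{2n}}{2\pi n c\,(1+\lambda-c)\,\lambda^{2m}}\,\exp(-\lambda^2/2),$$

where $\mathcal{G}_{1,1}(n,m)$ is the set of digraphs with $n$ vertices, $m$ arcs and all in- and outdegrees
at least 1 (see Section 2.2), $c = m/n$ and $\lambda$ is determined by $c = \lambda e^{\lambda}/(e^{\lambda}-1)$.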



Using the same method, we also extend this to digraphs with all outdegrees at least $k^+$ and
all indegrees at least $k^-$.

Theorem 1.3. Fix positive integers $k^+$ and $k^-$. Uniformly for $m = O(n \log n)$, $m - k^+ n \to
+\infty$ and $m - k^- n \to +\infty$, the number of digraphs on $n$ vertices, $m$ arcs, outdegrees at least
$k^+$ and indegrees at least $k^-$ is asymptotic to

$$\frac{(m-1)!\,\bigl(f_{k^-}(\lambda^-)\,f_{k^+}(\lambda^+)\bigr)^{n}}{2\pi\sqrt{(1+\eta^+-c)(1+\eta^--c)}\,(\lambda^-\lambda^+)^{m}}\,\exp(-\lambda^-\lambda^+/2),$$

where $c = m/n$,

$$f_k(\lambda) = \sum_{i \ge \max\{k,0\}} \frac{\lambda^i}{i!},$$

$\lambda^+$ and $\lambda^-$ are the unique positive roots of

$$c = \lambda^+ f_{k^+-1}(\lambda^+)/f_{k^+}(\lambda^+), \qquad c = \lambda^- f_{k^--1}(\lambda^-)/f_{k^-}(\lambda^-)$$

respectively, and

$$\eta^+ = \lambda^+ f_{k^+-2}(\lambda^+)/f_{k^+-1}(\lambda^+), \qquad \eta^- = \lambda^- f_{k^--2}(\lambda^-)/f_{k^--1}(\lambda^-).$$

The results stated so far refer to digraphs that are allowed to have loops but not multiple
arcs. In Section 7 we extend these results to the case when loops are forbidden, and obtain
the following analogues of Theorems 1.1 and 1.3.

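Theorem 1.4. Uniformly for $m = O(n \log n)$ and $m - n \to +\infty$, the number of strongly
connected loop-free digraphs with $n$ vertices and $m$ arcs is asymptotic to

$$\frac{(m-1)!\,(e^{\lambda}-1)^{2n}}{2\pi(1+\lambda-c)\,\lambda^{2m}}\,\exp\bigl(-\lambda^2/2 - c(1-e^{-\lambda})^2\bigr)\,\frac{(e^{\lambda}-1-\lambda)^2\,e^{\lambda}}{(e^{2\lambda}-e^{\lambda}-\lambda)(e^{\lambda}-1)},$$

where $c$ and $\lambda$ are as in Theorem 1.1.

Note that, for Theorem 1.4, the only effect of forbidding loops is to introduce the extra
factor $\exp(-c(1-e^{-\lambda})^2)$.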

Theorem 1.5. Fix positive integers $k^+$ and $k^-$, and recall the notation of Theorem 1.3.
Uniformly for $m = O(n \log n)$, $m - k^+ n \to +\infty$ and $m - k^- n \to +\infty$, the number of loop-free
digraphs on $n$ vertices, $m$ arcs, outdegrees at least $k^+$ and indegrees at least $k^-$ is asymptotic to

$$\frac{(m-1)!\,\bigl(f_{k^-}(\lambda^-)\,f_{k^+}(\lambda^+)\bigr)^{n}}{2\pi\sqrt{(1+\eta^+-c)(1+\eta^--c)}\,(\lambda^-\lambda^+)^{m}}\,\exp(-c-\lambda^-\lambda^+/2).$$

For Theorem 1.5, the only effect of forbidding loops is the extra factor $e^{-c}$.

Cooper and Frieze [H Theorem 3(vi)] obtained a significant result relevant to this prob- 
lem, in the form of the asymptotic probability that a random digraph with given degree 
sequence is strongly connected, under certain assumptions on the degree sequence. It would 



be rather straightforward to combine this with our Theorem 1.2, along with properties of degree
sequences which we use in our paper, to deduce an asymptotic formula for $S(n,m)$ when
$m/n > 1$ is bounded away from 1 and is bounded. For completeness, we derive this case of
the formula in a different way, following the same approach as we use for the case $m/n \to 1$,
which we consider in Section [SJ 

Boris Pittel [10] has independently investigated the second approach of [12] mentioned
above. Applying it to this problem in the loop-free case, he has simultaneously obtained a
formula similar to that in Theorem 1.4 under the restriction that $m = O(n)$, but also including
an explicit error estimate.



2 Basics and notation 

2.1 Truncated Poisson distribution 

We consider a discrete probability distribution that will be used many times in the argument.
Given $\lambda > 0$ and a nonnegative integer $k$, we say that a random variable (r.v.) $Y$ has a
$k$-truncated Poisson distribution of parameter $\lambda$ (or simply $Y \sim \mathrm{TPo}_k(\lambda)$) if

$$P(Y = i) = \begin{cases} \dfrac{\lambda^i}{f_k(\lambda)\,i!} & \text{if } i \ge k, \\[4pt] 0 & \text{if } 0 \le i < k, \end{cases}$$

where $f_k(\lambda) = \sum_{i \ge k} \lambda^i/i!$. For later convenience we also define $f_k(\lambda) = e^{\lambda}$ for integer $k < 0$.
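The definition can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names `f_k` and `tpo_pmf` and the truncation parameter `terms` are illustrative assumptions, and the series is simply cut off after a fixed number of terms.

```python
import math


def f_k(lam: float, k: int, terms: int = 200) -> float:
    """Tail sum f_k(lam) = sum_{i >= max(k, 0)} lam**i / i!, cut off after `terms` terms."""
    start = max(k, 0)
    term = lam ** start / math.factorial(start)
    total = 0.0
    for i in range(start, start + terms):
        total += term
        term *= lam / (i + 1)  # next term lam**(i+1) / (i+1)!
    return total


def tpo_pmf(i: int, lam: float, k: int) -> float:
    """P(Y = i) for Y ~ TPo_k(lam): zero below k, renormalised Poisson weights from k on."""
    if i < k:
        return 0.0
    return lam ** i / (math.factorial(i) * f_k(lam, k))
```

For moderate values of the parameter the probabilities sum to 1 up to rounding, and the mean computed from `tpo_pmf` agrees with $\lambda f_{k-1}(\lambda)/f_k(\lambda)$.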

We first give a rough tail bound for a random variable $Y \sim \mathrm{TPo}_k(\lambda)$ for constant $k$ but $\lambda$
possibly depending on $n$. Consider constants $A > B > e$, and let $p$ be a constant nonnegative
integer. Then for $j > \max\{Ae\lambda, k\}$ we have

In particular, 

$$P(Y \ge j) = O(B^{-j}), \quad E(Y\,1_{Y \ge j}) = O(B^{-j}) \quad\text{and}\quad E([Y]_2\,1_{Y \ge j}) = O(B^{-j}). \qquad (4)$$
(We use $[x]_k$ to denote the falling factorial $x(x-1)\cdots(x-k+1)$ throughout this paper.)

Our main use of the $\mathrm{TPo}_k(\lambda)$ distribution is to allow us to make computations on the
multinomial distribution truncated from below. The following lemma establishes a connection
between these distributions, and will be used throughout the paper, often without explicit
mention. (See for example [3, Section 2] for a proof of this lemma.)

Lemma 2.1. Distribute $M > kN$ distinguishable balls into $N$ distinguishable bins
u.a.r. subject to the condition that each bin receives at least $k \ge 1$ balls. Let $Y_i$ be the number
of balls in bin $i$. Then the joint distribution of $Y_1, \dots, Y_N$ is the same as that of $N$ independent
copies of $\mathrm{TPo}_k(\lambda)$ for arbitrary $\lambda > 0$ conditional upon $Y_1 + \cdots + Y_N = M$.
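For small parameters Lemma 2.1 can be verified exactly by enumeration. The sketch below is only an illustration under assumed names (`occupancy_law`, `conditioned_tpo_law`); it uses the fact that an occupancy vector $(y_1, \dots, y_N)$ arises from $M!/\prod_i y_i!$ allocations of distinguishable balls, and it works with rational arithmetic so that the two laws can be compared for equality.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def occupancy_law(M: int, N: int, k: int) -> dict:
    """Law of (Y_1..Y_N): M distinguishable balls placed u.a.r. in N bins, every bin >= k."""
    weights = {}
    for ys in product(range(k, M + 1), repeat=N):
        if sum(ys) == M:
            w = factorial(M)
            for y in ys:
                w //= factorial(y)  # exact: builds the multinomial coefficient
            weights[ys] = w
    total = sum(weights.values())
    return {ys: Fraction(w, total) for ys, w in weights.items()}


def conditioned_tpo_law(M: int, N: int, k: int, lam: Fraction) -> dict:
    """Law of N independent TPo_k(lam) variables conditioned on their sum being M."""
    weights = {}
    for ys in product(range(k, M + 1), repeat=N):
        if sum(ys) == M:
            w = Fraction(1)
            for y in ys:
                w *= lam ** y / factorial(y)  # the factor f_k(lam)**N cancels on normalising
            weights[ys] = w
    total = sum(weights.values())
    return {ys: w / total for ys, w in weights.items()}
```

Both functions return the same dictionary whatever positive rational `lam` is supplied, which is the "arbitrary $\lambda > 0$" part of the lemma.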



It is easy to see that a variable $Y \sim \mathrm{TPo}_k(\lambda)$ has $EY = c$ given by

$$c = \frac{\lambda f_{k-1}(\lambda)}{f_k(\lambda)}. \qquad (5)$$

Henceforth, given $c > k$, we assume that $\lambda$ is set equal to the unique (by [11, Lemma 1])
positive root of this equation. We also define

$$\eta = \frac{\lambda f_{k-2}(\lambda)}{f_{k-1}(\lambda)}. \qquad (6)$$

Elementary computations show that, for such choice of $\lambda$ and $\eta$, we have $E(Y(Y-1)) = \eta c$.
More properties of the $\mathrm{TPo}_k(\lambda)$ distribution are given in [11]. It is easy to check that $0 < \lambda < c$
in all cases. From [11, Theorem 4(a)] we have the following.

Lemma 2.2. Let $M = O(N \log N)$ be an integer such that $r := M - kN \to \infty$ and put $c = M/N$.
Let $Y_1, \dots, Y_N$ be i.i.d. random variables with $\mathrm{TPo}_k(\lambda)$ distribution, for fixed $k$, where $\lambda$ is
determined from $c$ in (5), and define $\eta$ as in (6). Then, as $N \to \infty$,

$$P(Y_1 + \cdots + Y_N = M) \sim \frac{1}{\sqrt{2\pi N c\,(1+\eta-c)}} = \Theta(1/\sqrt{r}).$$

Throughout the paper, we mostly focus our attention on the case $k = 1$ and simply refer
to the $\mathrm{TPo}_1(\lambda)$ distribution as $\mathrm{TPo}(\lambda)$ or simply truncated Poisson. In this particular case,
(5) can be rewritten as

$$c = \frac{\lambda e^{\lambda}}{e^{\lambda}-1}, \qquad (7)$$

and moreover we have $\eta = \lambda$.
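In the case $k = 1$ the parameter $\lambda$ has no closed form in terms of $c$, but it is easy to compute. The following is a minimal sketch using bisection (the paper does not prescribe a numerical method; the names `tpo_mean` and `solve_lambda` are assumptions made for illustration). It relies on the mean being increasing in $\lambda$ and on the bound $0 < \lambda < c$ stated above.

```python
import math


def tpo_mean(lam: float) -> float:
    """E Y for Y ~ TPo(lam): lam * e**lam / (e**lam - 1), rewritten to avoid overflow."""
    return lam / -math.expm1(-lam)


def solve_lambda(c: float, iterations: int = 200) -> float:
    """Bisection for the positive root of c = lam * e**lam / (e**lam - 1), assuming c > 1."""
    if c <= 1:
        raise ValueError("need c > 1")
    lo, hi = 0.0, c  # the root is known to satisfy 0 < lam < c
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if tpo_mean(mid) < c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

With $\lambda$ obtained this way, the second factorial moment of $\mathrm{TPo}(\lambda)$ evaluates numerically to $\lambda c$, in line with $\eta = \lambda$.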

On several occasions we use Chernoff bounds for a binomially distributed $\mathrm{Bin}(n,p)$ random
variable $X$ in the common form

$$P(|X - np| > a) < 2e^{-2a^2/n}, \qquad (8)$$

or the variation more useful when $p$ is small:

$$P(|X - np| > a) < 2e^{-a^2/(3np)} \quad\text{for } a \le np \qquad (9)$$

(from Molloy [7]; see also Alon and Spencer [1, Theorems A.1.11 and A.1.13]).

We close this subsection with some rather technical lemmas on independent variables with
$\mathrm{TPo}_k(\lambda)$ distribution.

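Lemma 2.3. Let $Y_1, \dots, Y_N$ be independent r.v.s with $\mathrm{TPo}_k(\lambda)$ distribution, for fixed $k$ and
for $0 < \lambda < \log N$. Put $C = EY_1$. Then for any $t \ge \sqrt{N}\log^2 N$ we have

$$P\Bigl(\Bigl|\sum_{i=1}^{N} Y_i - CN\Bigr| \ge t\Bigr) = O\bigl(e^{-(t^2/8N)^{1/3}}\bigr)$$

asymptotically as $N \to \infty$.

Proof. Let $Y_{\max} = \max_i\{Y_i\}$. Setting $A = (t^2/8N)^{1/3}$, we have $P(Y_{\max} > A) \le N P(Y_1 >
A) = O(e^{-A})$ by (4) and since $A = \Omega(\log^{4/3} N)$. Now define

$$W_i = Y_i - C \quad\text{and}\quad W_i^* = W_i\,1_{Y_i \le A},$$

and again from (4) deduce

$$|EW_i^*| = |-E(W_i\,1_{Y_i > A})| \le E(Y_i\,1_{Y_i > A}) = O(e^{-A}). \qquad (10)$$

Moreover, we have $-C \le W_i^* \le A - C$, and then $|W_i^* - EW_i^*| \le A$, so by the Azuma-Hoeffding
inequality

$$P\Bigl(\Bigl|\sum_{i=1}^{N}(W_i^* - EW_i^*)\Bigr| \ge t/2\Bigr) \le 2\exp\Bigl(-\frac{(t/2)^2}{2NA^2}\Bigr) = 2e^{-A}. \qquad (11)$$

Therefore

$$P\Bigl(\Bigl|\sum_{i=1}^{N} Y_i - CN\Bigr| \ge t\Bigr) \le P(Y_{\max} > A) + P\Bigl(\Bigl|\sum_{i=1}^{N} W_i^*\Bigr| \ge t\Bigr)$$
$$\le O(e^{-A}) + P\Bigl(\Bigl|\sum_{i=1}^{N}(W_i^* - EW_i^*)\Bigr| \ge t - \Bigl|\sum_{i=1}^{N} EW_i^*\Bigr|\Bigr)$$
$$\le O(e^{-A}) + P\Bigl(\Bigl|\sum_{i=1}^{N}(W_i^* - EW_i^*)\Bigr| \ge t/2\Bigr)$$
$$= O(e^{-A}),$$

where we used (11) and the fact that $|\sum_{i=1}^{N} EW_i^*| \le t/2$, which follows from (10). □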



The following result was essentially shown in [11].

Lemma 2.4. Let $Y_1, \dots, Y_N$ be independent r.v.s with $\mathrm{TPo}_k(\lambda)$ distribution, for fixed $k$ and
for $0 < \lambda < \log N$. Put $C = E(Y_1(Y_1 - 1))$. Then

$$P\Bigl(\Bigl|\sum_{i=1}^{N} Y_i(Y_i - 1) - CN\Bigr| \ge 4N^{1/2}\log^8 N\Bigr) = O(\exp(-\log^3 N)),$$

asymptotically as $N \to \infty$.

Proof. The statement in the lemma comes directly from equation (33) in [11], considering
(16), (22), (28), (29) and Lemmas 1 and 2 of that paper. See also the proof of Lemma 2.3,
which uses the same method in full detail. □

We will use the following for k = 1,2. 

Lemma 2.5. Let $k \ge 1$ be an integer, and let $Y_1, \dots, Y_N$ be independent $\mathrm{TPo}_k(\lambda)$ r.v.s.
Consider $N$ bins, place $Y_i$ balls in bin $i$ ($i = 1, \dots, N$), and then select each ball independently
with probability $q \le 1/2$ where $Nq \ge \log N$. Then the number $X$ of bins containing at least
one selected ball satisfies

$$P\bigl(|X - EX| \ge \sqrt{EX}\,\log N\bigr) = e^{-\Omega(\log^2 N)}$$

asymptotically as $N \to \infty$; and moreover

$$EX/N \ge kq\bigl(1 - (k-1)/4 + (2^{-k}/k)\,P(Y_1 \ge k+1)\bigr).$$

Proof. Let $q'$ be the probability that a bin contains at least one selected ball. We have

$$1 - q' \le (1-q)^k P(Y_1 = k) + (1-q)^{k+1} P(Y_1 \ge k+1) = (1-q)^k - q(1-q)^k P(Y_1 \ge k+1).$$

Using the elementary bound $(1-q)^k \le 1 - kq + \binom{k}{2}q^2$ and the fact that $q \le 1/2$, we obtain

$$q' \ge kq\bigl(1 - (k-1)q/2\bigr) + q(1-q)^k P(Y_1 \ge k+1) \ge kq\bigl(1 - (k-1)/4 + (2^{-k}/k)\,P(Y_1 \ge k+1)\bigr), \qquad (12)$$

and trivially $q' \ge q$ in any case. Since $X \sim \mathrm{Bin}(N, q')$, it follows by (9) that

$$P\bigl(|X - Nq'| \ge \sqrt{Nq'}\,\log N\bigr) < 2e^{-\log^2 N/3}. \qquad (13)$$

□

2.2 Probability spaces of digraphs and degree sequences 

Let $\mathcal{G}(n,m)$ be the set of digraphs on $n$ labelled vertices and $m$ arcs. In our definition of
digraph we allow loops but not multiple arcs. It is a simple matter to adjust our arguments
for loop-free digraphs (see Section 7). For a given digraph in $\mathcal{G}(n,m)$, let $d^+ = (d_1^+, \dots, d_n^+)$
and $d^- = (d_1^-, \dots, d_n^-)$ denote respectively the sequences of out- and indegrees of the vertices.
The degree of vertex $i$ is defined to be the tuple $d_i = (d_i^+, d_i^-)$, so the joint in- and outdegree
sequences can be represented by $d = (d_1, \dots, d_n)$. For feasibility, it is necessary that

$$\sum_{i=1}^{n} d_i^+ = \sum_{i=1}^{n} d_i^- = m. \qquad (14)$$

Let $c = m/n$ and assume that $c > 1$ throughout the article, though $m$ and hence $c$ are functions
of $n$. Let $\mathcal{G}_{1,1}(n,m)$ be the set of digraphs in $\mathcal{G}(n,m)$ such that $d_i^+, d_i^- \ge 1$ for all $i \in \{1, \dots, n\}$.
(Note that this is a necessary condition for strong connectedness when $n > 1$.) Elements of
$\mathcal{G}_{1,1}(n,m)$ we call (1,1)-dicores or simply dicores. We also write $\mathcal{G}(n,m)$ and $\mathcal{G}_{1,1}(n,m)$ to
denote the corresponding uniform probability spaces. We define $r = m - n = (c-1)n$ and
assume $r \to \infty$. We distinguish three subcases: very sparse, with $r = o(n)$ or equivalently
$c \to 1$; moderately sparse, with $r = \Theta(n)$; and a denser case, with $c \to \infty$ but $c = O(\log n)$.
(All logarithms are natural unless otherwise specified.)

Let $\mathcal{D}$ be the set of sequences $d = (d_1, \dots, d_n)$, with $d_i = (d_i^+, d_i^-)$ for $i \in \{1, \dots, n\}$,
where the $2n$ entries $d_i^+$ and $d_i^-$ are positive integers. Let $\hat{\mathcal{D}}$ be the subset of sequences in $\mathcal{D}$
satisfying the total sum conditions (14). Note that $\hat{\mathcal{D}}$ coincides with the set of all possible
degree sequences of dicores in $\mathcal{G}_{1,1}(n,m)$. Given any $d \in \hat{\mathcal{D}}$, let $\mathcal{G}(d)$ denote the set (and also the
corresponding uniform probability space) of digraphs with degree sequence $d$. Also consider
the usual directed pairing model $\mathcal{P}(d)$, defined as follows. Take $n$ bins, where the $i$-th bin
contains points of two types, namely $d_i^+$ out-points and $d_i^-$ in-points, and consider a random
matching of the $m$ out-points with the $m$ in-points. Each element in $\mathcal{P}(d)$ corresponds to a
multidigraph in the obvious way, and the restriction to simple digraphs (i.e. with no multiple
arcs) generated this way is uniform.
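The pairing model is straightforward to simulate. The sketch below is an illustration only, with assumed function names (`random_pairing`, `is_simple`, `random_simple_digraph`); vertices are numbered from 0, and uniformity over simple digraphs is obtained by plain rejection, which is a generic technique rather than anything specific to the paper.

```python
import random


def random_pairing(out_deg, in_deg, rng):
    """Match the out-points with the in-points u.a.r.; return the arcs of the multidigraph."""
    if sum(out_deg) != sum(in_deg):
        raise ValueError("total outdegree and total indegree must both equal m")
    out_points = [v for v, d in enumerate(out_deg) for _ in range(d)]
    in_points = [v for v, d in enumerate(in_deg) for _ in range(d)]
    rng.shuffle(in_points)  # a uniform random bijection between the two point sets
    return list(zip(out_points, in_points))


def is_simple(arcs):
    """Loops are allowed; only repeated arcs (u, v) are forbidden."""
    return len(set(arcs)) == len(arcs)


def random_simple_digraph(out_deg, in_deg, rng):
    """Resample pairings until one is simple (assumes a simple digraph with these degrees exists)."""
    while True:
        arcs = random_pairing(out_deg, in_deg, rng)
        if is_simple(arcs):
            return arcs
```

For example, `random_simple_digraph([2, 1, 1], [1, 2, 1], random.Random(0))` returns four distinct arcs whose tails and heads realise the prescribed out- and indegrees.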

In order to study the distribution of degree sequences of $\mathcal{G}_{1,1}(n,m)$, it will prove useful to
turn the sets $\mathcal{D}$ and $\hat{\mathcal{D}}$ into suitable probability spaces, as follows. Random degree sequences
$d \in \mathcal{D}$ are chosen by taking the $2n$ entries $d_i^+$ and $d_i^-$ as independent copies of $\mathrm{TPo}(\lambda)$. Let $\Sigma$
be the event in $\mathcal{D}$ that (14) holds, and define $\hat{\mathcal{D}}$ to be the corresponding conditional probability
space. Moreover, let $\mathcal{P}_{1,1}(n,m)$ be the probability space of random pairings in $\mathcal{P}(d)$ where
the degree sequence $d$ is drawn from the distribution of $\hat{\mathcal{D}}$ defined above. Each pairing in
$\mathcal{P}_{1,1}(n,m)$ corresponds to a multidigraph, and as will become apparent later the restriction of
$\mathcal{P}_{1,1}(n,m)$ to simple digraphs generates elements of $\mathcal{G}_{1,1}(n,m)$ uniformly.

We also need the notation $d^+_{\max} = \max\{d_i^+ : 1 \le i \le n\}$ and $d^-_{\max} = \max\{d_i^- : 1 \le i \le n\}$.

3 Asymptotic enumeration of dicores 

Here we prove Theorems 1.2 and 1.3 by adapting the main argument of [11]. Before that, we
need some lemmata. The following result is an immediate consequence of Theorem 4.6 in [6]
by McKay (we just need to use the standard interpretation of digraphs with loops as bipartite
graphs).

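Lemma 3.1 (McKay). Let $d \in \hat{\mathcal{D}}$ be a sequence of degrees and suppose that $d^+_{\max}, d^-_{\max} \le \Delta$
for some $\Delta = o(m^{1/4})$. Then the probability that a random element of $\mathcal{P}(d)$ has no multiple
arcs is

$$\exp\Bigl(-\frac{1}{2m^2}\sum_{i=1}^{n} d_i^+(d_i^+-1)\sum_{j=1}^{n} d_j^-(d_j^--1) + O(\Delta^4/m)\Bigr),$$

uniformly for all $d$.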



The following technical result estimates the probability that a degree sequence in $\mathcal{D}$
satisfies (14), and averages the probability that a random pairing is simple over any subset of
degree sequences with that property. Here $\lambda$ and $c$ are defined as in Theorem 1.2.

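Lemma 3.2. Assume that $m - n \to \infty$ and $m = O(n \log n)$.

(a) $P_{\mathcal{D}}(\Sigma) \sim \dfrac{1}{2\pi n c\,(1+\lambda-c)} = \Theta\bigl(1/(m-n)\bigr)$.

Moreover, if $S$ is the event that a random pairing in $\mathcal{P}(d)$ or $\mathcal{P}_{1,1}(n,m)$ is simple, then

(b) $P_{\mathcal{P}_{1,1}(n,m)}(S) = E_{\hat{\mathcal{D}}}\bigl(P_{\mathcal{P}(d)}(S)\bigr) \sim e^{-\lambda^2/2}$;

(c) for any r.v. $X$ on $\hat{\mathcal{D}}$ satisfying $|X| \le x$ for some fixed constant $x \in \mathbb{R}$,

$$E_{\hat{\mathcal{D}}}\bigl(P_{\mathcal{P}(d)}(S) \cdot X\bigr) = (1+o(1))\,e^{-\lambda^2/2}\,E_{\hat{\mathcal{D}}} X + O\bigl(e^{-\log^3 n}\bigr).$$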



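Proof. From Lemma 2.2, the independent events $\sum_i d_i^+ = m$ and $\sum_i d_i^- = m$ each have
probability $(1+o(1))/\sqrt{2\pi n c\,(1+\lambda-c)}$, which gives (a). Note that (b) follows from (c) by
setting $X = 1$, since the bound on $m$ implies $\lambda = O(\log n)$, so it only remains to prove (c).
For this, we follow the proof of [11, Theorem 4(b)] almost exactly.

We require some definitions. Let $F = F(d) = P_{\mathcal{P}(d)}(S)$ and

$$\hat{F} = \exp\bigl(-\tfrac{1}{2} D^+ D^-\bigr),$$

where

$$D^+ = \frac{1}{m}\sum_{i=1}^{n} d_i^+(d_i^+-1) \quad\text{and}\quad D^- = \frac{1}{m}\sum_{j=1}^{n} d_j^-(d_j^--1).$$

We set $\Delta = \log^3 n$, and let $B_1$ denote the 'bad' event that $d^+_{\max} > \Delta$ or $d^-_{\max} > \Delta$. From (4)
we obtain $P_{\mathcal{D}}(B_1) \le 2nP(Y > \Delta) = O(n\Delta^{-\Delta})$. Then, we use the result from (a) to deduce
that $P_{\hat{\mathcal{D}}}(B_1) \le P_{\mathcal{D}}(B_1)/P_{\mathcal{D}}(\Sigma) = O(n^2 c\,(1+\lambda-c)\Delta^{-\Delta}) = O(e^{-\log^3 n})$.

In view of Lemma 3.1 and bearing in mind that $0 \le F, \hat{F} \le 1$ and $|X| \le x$, we can write

$$E_{\hat{\mathcal{D}}}(FX) = E_{\hat{\mathcal{D}}}(FX\,1_{B_1}) + E_{\hat{\mathcal{D}}}(FX\,1_{\overline{B_1}})$$
$$= O(P_{\hat{\mathcal{D}}}(B_1)) + (1 + O(\Delta^4/m))\,E_{\hat{\mathcal{D}}}(\hat{F}X\,1_{\overline{B_1}})$$
$$= O(e^{-\log^3 n}) + (1 + O(\Delta^4/m))\,E_{\hat{\mathcal{D}}}(\hat{F}X\,1_{\overline{B_1}}). \qquad (15)$$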

Simple computations show that EZ) + = ED - = A (with D + and D~ independent). Set 
t = Sn^^log^n, and define B 2 to be the 'bad' event that \D + D~/2 - A 2 /2| > t. Whenever 
B 2 does not hold, we have F = exp(-A 2 /2 + 0(t)) = (1 + 0(t)) exp(-A 2 /2), so 

E 5 (FX 1^) = E 3 (FX 1=- AB2 ) + E^FX 1=~ A =-) 

= 0(P fi (ft)) + (1 + 0(t)) e" A2 /2 E . (X) . ( i 6) 

It only remains to bound P,g(02). Set s = t/(2\ogn) = 4n _1 ' 2 log n, and note that if 
\D+ - A| < s and \D~ - A| < s then 

|D+L>-/2 - A 2 /2| < \\D + - \\\D~ - A| + X\D+ - A| + A|D" - A|) < g2 + 2sl °g n < t. 
Therefore, by Lemma [2 



P$(B 2 ) < P$(\D+ -\\>s) + P 3 (\D- - A| > s) = 0(e~ l ^ n ). (17) 

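Part (c) in the statement follows by combining (15), (16) and (17). □

Now we are in good shape to prove the theorem.

Proof of Theorem 1.2. Observe that $|\mathcal{P}(d)| = m!$, and that each simple digraph with degree
sequence $d$ comes from exactly $\prod_{i=1}^{n} d_i^+!\,d_i^-!$ different pairings in $\mathcal{P}(d)$. Thus

$$|\mathcal{G}(d)| = \frac{m!\,P_{\mathcal{P}(d)}(S)}{\prod_{i=1}^{n} d_i^+!\,d_i^-!},$$

where $S$ denotes the event that a random pairing in $\mathcal{P}(d)$ has no multiple arcs. Moreover, for
each $d \in \hat{\mathcal{D}}$,

$$P_{\mathcal{D}}(d) = \prod_{i=1}^{n} \frac{\lambda^{d_i^+ + d_i^-}}{(e^{\lambda}-1)^2\,d_i^+!\,d_i^-!} = \frac{\lambda^{2m}}{(e^{\lambda}-1)^{2n}\,\prod_{i=1}^{n} d_i^+!\,d_i^-!}.$$

Therefore, summing over all degree sequences, we can write

$$|\mathcal{G}_{1,1}(n,m)| = \sum_{d \in \hat{\mathcal{D}}} \frac{m!\,P_{\mathcal{P}(d)}(S)}{\prod_{i=1}^{n} d_i^+!\,d_i^-!} = \frac{m!\,(e^{\lambda}-1)^{2n}}{\lambda^{2m}} \sum_{d \in \hat{\mathcal{D}}} P_{\mathcal{D}}(d)\,P_{\mathcal{P}(d)}(S)$$
$$= \frac{m!\,(e^{\lambda}-1)^{2n}}{\lambda^{2m}}\,P_{\mathcal{P}_{1,1}(n,m)}(S)\,P_{\mathcal{D}}(\Sigma) \sim \frac{m!\,(e^{\lambda}-1)^{2n}}{2\pi n c\,(1+\lambda-c)\,\lambda^{2m}}\,\exp(-\lambda^2/2),$$

where we used Lemma 3.2. □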



In addition, the computations in the proof of Theorem 1.2 give the following.

Corollary 3.3. The elements in $\mathcal{G}_{1,1}(n,m)$ can be uniformly generated by restricting the
probability space $\mathcal{P}_{1,1}(n,m)$ to simple pairings and considering the corresponding digraph.

Proof. A dicore $G$ in $\mathcal{G}_{1,1}(n,m)$ with degree sequence $d$ comes from exactly $\prod_{i=1}^{n} d_i^+!\,d_i^-!$
different pairings. Each of these pairings must be simple and has probability

$$\frac{\lambda^{2m}/(e^{\lambda}-1)^{2n}}{m!\,P_{\mathcal{D}}(\Sigma)\,\prod_{i=1}^{n} d_i^+!\,d_i^-!}\,\bigl(P_{\mathcal{P}_{1,1}(n,m)}(S)\bigr)^{-1} \qquad (18)$$

in the space $\mathcal{P}_{1,1}(n,m)$ conditional upon the event $S$ of being simple. The product of (18)
times $\prod_{i=1}^{n} d_i^+!\,d_i^-!$ does not depend on the particular $d$, and therefore the distribution of $G$
when generated from simple pairings is uniform. □

Finally, we can extend the concept of dicore defined in Section 2 as follows. Given $\mathbf{k} =
(k^+, k^-)$ where $k^+$ and $k^-$ are positive integer constants, a $\mathbf{k}$-dicore is an element of $\mathcal{G}(n,m)$
with a degree sequence satisfying $d_i^+ \ge k^+$ and $d_i^- \ge k^-$, for all $i \in \{1, \dots, n\}$. Let $\mathcal{G}_{\mathbf{k}}(n,m)$
denote both the set of $\mathbf{k}$-dicores and the corresponding uniform probability space.

In order to study the degree sequences of $\mathcal{G}_{\mathbf{k}}(n,m)$, we need some definitions. Let $\lambda^+$ and
$\eta^+$ (resp., $\lambda^-$ and $\eta^-$) be obtained from (5) and (6) after replacing $k$, $\lambda$ and $\eta$ by $k^+$, $\lambda^+$ and
$\eta^+$ (resp., by $k^-$, $\lambda^-$ and $\eta^-$). Define the set of degree sequences $\mathcal{D}_{\mathbf{k}}$ analogously to $\mathcal{D}$, with
the extra condition that $d_i^+ \ge k^+$ and $d_i^- \ge k^-$, for all $i \in \{1, \dots, n\}$, and similarly let $\hat{\mathcal{D}}_{\mathbf{k}}$
be the subset of sequences in $\mathcal{D}_{\mathbf{k}}$ satisfying (14). Moreover, we endow $\mathcal{D}_{\mathbf{k}}$ with a probability
distribution by selecting the $d_i^+$ and the $d_i^-$ independently according to the $\mathrm{TPo}_{k^+}(\lambda^+)$ and
the $\mathrm{TPo}_{k^-}(\lambda^-)$ distributions, respectively. The $\hat{\mathcal{D}}_{\mathbf{k}}$ space is simply $\mathcal{D}_{\mathbf{k}}$ conditional upon (14).
Furthermore, we define $\mathcal{P}_{\mathbf{k}}(n,m)$ as we did for $\mathcal{P}_{1,1}(n,m)$ but randomising the degree sequence
$d$ according to the distribution of $\hat{\mathcal{D}}_{\mathbf{k}}$ defined above.

Now we are in good shape to extend the argument in the proof of Theorem 1.2 to general
$\mathbf{k}$-dicores.




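Proof of Theorem 1.3. The proof is straightforward by going along the same steps as the
proof of Theorem 1.2 but replacing $\mathcal{D}$, $\hat{\mathcal{D}}$ and $\mathcal{P}_{1,1}(n,m)$ by $\mathcal{D}_{\mathbf{k}}$, $\hat{\mathcal{D}}_{\mathbf{k}}$ and $\mathcal{P}_{\mathbf{k}}(n,m)$, and
considering the distributions $\mathrm{TPo}_{k^+}(\lambda^+)$ or $\mathrm{TPo}_{k^-}(\lambda^-)$ instead of $\mathrm{TPo}(\lambda)$ when appropriate.
The key part is extending Lemma 3.2 to the new setting, which is also straightforward. The
extended statement is as follows. Assume that $m - k^+ n \to \infty$, $m - k^- n \to \infty$ and $m =
O(n \log n)$. Then

(a) $P_{\mathcal{D}_{\mathbf{k}}}(\Sigma) \sim \dfrac{1}{2\pi n c\,\sqrt{(1+\eta^+-c)(1+\eta^--c)}}$.

Moreover, if $S$ is the event that a random pairing in $\mathcal{P}(d)$ or $\mathcal{P}_{\mathbf{k}}(n,m)$ is simple, then

(b) $P_{\mathcal{P}_{\mathbf{k}}(n,m)}(S) = E_{\hat{\mathcal{D}}_{\mathbf{k}}}\bigl(P_{\mathcal{P}(d)}(S)\bigr) \sim e^{-\lambda^+\lambda^-/2}$;

(c) for any r.v. $X$ on $\hat{\mathcal{D}}_{\mathbf{k}}$ satisfying $|X| \le x$ for some fixed constant $x \in \mathbb{R}$,

$$E_{\hat{\mathcal{D}}_{\mathbf{k}}}\bigl(P_{\mathcal{P}(d)}(S) \cdot X\bigr) = (1+o(1))\,e^{-\lambda^+\lambda^-/2}\,E_{\hat{\mathcal{D}}_{\mathbf{k}}} X + O\bigl(e^{-\log^3 n}\bigr). \quad □$$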

4 Moderately sparse case: c bounded 

In this section we will prove Theorem 1.1 for the case that $c = m/n$ is bounded and also
bounded away from 1.

A sink-set in a digraph $G$ is a non-empty proper subset $S$ of vertices such that the out-set
of $S$ is a subset of $S$. That is, no arc goes from $S$ to $V(G) \setminus S$. A set of vertices is a source-set
if its complement is a sink-set. A sink-set in a digraph with minimum outdegree at least
1 is plain if its vertices all have outdegree exactly 1, and is otherwise complex. Plain and
complex source-sets are defined analogously by replacing outdegree by indegree. Observe that
a digraph $G$ is strongly connected iff it has no sink-set (and equivalently no source-set). We
use the term s-set to denote sets of vertices which are a sink-set or a source-set.
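The equivalence between strong connectivity and the absence of sink-sets can be checked by brute force on very small digraphs. The sketch below is illustrative only: the function names are assumptions, vertices are numbered $0, \dots, n-1$, and the subset enumeration is exponential in $n$, so it is meant purely as a sanity check of the definition.

```python
from itertools import combinations


def has_sink_set(n, arcs):
    """True if some non-empty proper subset S of vertices has no arc leaving it."""
    for size in range(1, n):
        for subset in combinations(range(n), size):
            members = set(subset)
            if all(v in members for (u, v) in arcs if u in members):
                return True
    return False


def reachable(n, arcs, start):
    """Set of vertices reachable from `start` along directed arcs (depth-first search)."""
    adj = [[] for _ in range(n)]
    for u, v in arcs:
        adj[u].append(v)
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def is_strongly_connected(n, arcs):
    """Vertex 0 must reach every vertex, both in the digraph and in its converse."""
    converse = [(v, u) for (u, v) in arcs]
    return len(reachable(n, arcs, 0)) == n and len(reachable(n, converse, 0)) == n
```

On every digraph with three vertices (loops allowed), `has_sink_set` returns `True` exactly when `is_strongly_connected` returns `False`.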

We first show that a.a.s. any complex s-set of $\mathcal{G}_{1,1}(n,m)$ must contain more than $m/2$ arcs.
Therefore, the strong connectedness of $\mathcal{G}_{1,1}(n,m)$ can be characterised in terms of plain s-sets.

Proposition 4.1. Suppose that $c = m/n$ is bounded and bounded away from 1. A digraph in
$\mathcal{G}_{1,1}(n,m)$ a.a.s. has no complex s-set containing at most $m/2$ arcs.

Proof. It is presumably possible to analyse $\mathcal{G}_{1,1}(n,m)$ or $\mathcal{P}_{1,1}(n,m)$ directly to achieve the
desired result, by an expectation argument similar to that commonly used for connectivity of
graphs. However, the expectation itself seems to be difficult to analyse. Instead we introduce
another probability space, by partitioning according to the indegree sequence and to the
multiset of outdegrees. More precisely, we will consider slices of $\mathcal{P}_{1,1}(n,m)$ with indegree
sequence $d^-$ and outdegree sequence being a permutation of $d^+$, for each $d \in \hat{\mathcal{D}}$.

One could argue by partitioning according to the joint values of $d^-$ and $d^+$, but certain
nasty combinations of in- and outdegrees, in which the vertices of outdegree 1 all have large
indegree, are likely to cause trouble, and rather ad-hoc arguments may be required to bound
the troublesome cases (see e.g. the approach in [3]). It is conceivable that allowing permutations
of the outdegree sequence instead helps to explain a little more of the structure of the
typical digraph in $\mathcal{G}(n,m)$.

To facilitate calculations of probabilities, for each $d \in \hat{\mathcal{D}}$ we introduce a probability space,
$\mathcal{P}'(d)$, which is similar to common models (called pairing or configuration models) for random
graphs or digraphs with given degree sequence. Consider two sets of points $A = \{a_1, \dots, a_m\}$
and $B = \{b_1, \dots, b_m\}$, with $A$ partitioned into nonempty sets (which we call bins) $A_i$, $i =
1, \dots, n$ (corresponding to the vertices of the digraph) with $|A_i| = d_i^+$ for each $i$, and similarly
$B$ partitioned into nonempty sets (bins) $B_i$, $i = 1, \dots, n$ with $|B_i| = d_i^-$ for each $i$. We write
$\alpha(a_i) = j$ if $a_i \in A_j$, and $\beta(b_i) = j$ if $b_i \in B_j$. A random element of $\mathcal{P}'(d)$ is a random
bijection $\phi : A \to B$ together with a random permutation $\sigma$ of $[n]$, such that the pair $(\phi, \sigma)$ is
chosen u.a.r. Each element in $\mathcal{P}'(d)$ can be mapped in a natural way to a pairing in $\mathcal{P}_{1,1}(n,m)$,
obtained by identifying points in $A_{\sigma(j)}$ and points in $B_j$ with out-points and in-points of bin $j$.
This corresponds in turn to a multidigraph $M$ which has an arc $(u,v)$ for each point $a_i \in A_{\sigma(u)}$
such that $\phi(a_i) \in B_v$, or equivalently the arc (multi)set is $\{\sigma^{-1}(\alpha(a_i))\,\beta(\phi(a_i)) : 1 \le i \le m\}$.
Observe that $M$ has indegree sequence $d^-$ and an outdegree sequence which is a random
permutation of $d^+$. As usual, all graph theory statements referring to an element in $\mathcal{P}'(d)$
should be understood in terms of the corresponding multidigraph.

Define $U$ to be the event, defined on any relevant probability spaces, that there is a
complex proper sink-set containing at most $m/2$ arcs. Ultimately, we will do the calculations
in the space $\mathcal{P}'(d)$ with $d$ randomised according to its distribution in the space $\hat{\mathcal{D}}$. Call this
space $\mathcal{P}'_{1,1}(n,m)$. Averaging over $d$ makes computations a little easier than arguing about its
typical values. In fact, observe that the distribution of a random degree sequence $d \in \hat{\mathcal{D}}$ stays
invariant if we randomly permute the entries of the outdegree sequence $d^+$. Hence, we deduce
that

$$P_{\mathcal{P}_{1,1}(n,m)}(U) = E_{\hat{\mathcal{D}}}\bigl(P_{\mathcal{P}(d)}(U)\bigr) = E_{\hat{\mathcal{D}}}\bigl(P_{\mathcal{P}'(d)}(U)\bigr) = P_{\mathcal{P}'_{1,1}(n,m)}(U). \qquad (19)$$

Thus, in view of Corollary 3.3 and Lemma 3.2, we have

$$P_{\mathcal{G}_{1,1}(n,m)}(U) = P_{\mathcal{P}_{1,1}(n,m)}(U \mid S) \le (1+o(1))\,e^{\lambda^2/2}\,P_{\mathcal{P}_{1,1}(n,m)}(U) = O\bigl(P_{\mathcal{P}'_{1,1}(n,m)}(U)\bigr). \qquad (20)$$

Therefore, we only need to show that $P_{\mathcal{P}'_{1,1}(n,m)}(U) = o(1)$ in order to prove the
statement for complex sink-sets. The result extends immediately to complex source-sets by
considering the converse digraph.

The remainder of the proof consists of bounding the probability that an element of
$\mathcal{P}'_{1,1}(n,m)$ has a complex sink-set with at most $m/2$ arcs. Observe that, if $S$ is a complex
sink-set and $v_0$ is a vertex in $S$ with outdegree strictly greater than 1 (there must exist at
least one of these because $S$ is complex), then the set $S' \subseteq S$ of vertices reachable from $v_0$
is also a complex sink-set. Therefore, we only need to consider complex sink-sets which are
precisely the set of vertices reachable from some vertex $v_0$.

Given a vertex $v_0$, the following algorithm will terminate with $S$ being the set of vertices
reachable from $v_0$. The algorithm works by maintaining a set $S$ of bins $A_i$ corresponding to
vertices reachable from $v_0$, and investigating the vertices reachable from $S$. It does this by
looking at the points in the bins in $S$. The set $T$ contains precisely those points which have not yet
been investigated.
Algorithm

Let $v_0$ be the initially chosen vertex. Start with $S = \{v_0\}$, $T = A_{\sigma(v_0)}$, and repeat the following
until $T$ is empty. Pick $a_i \in T$, add to $S$ the vertex $v = \beta(\phi(a_i))$ (if it is not already there),
delete $a_i$ from $T$ and, if $v$ was not already in $S$, add all elements in $A_{\sigma(v)}$ to $T$.
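The exploration can be sketched on an explicit multidigraph as follows. This is an illustration under stated assumptions rather than the procedure as analysed in the proof: the pairing is taken as already fully generated and is given as `out_arcs`, where `out_arcs[u]` lists the head of the arc at each out-point of vertex `u` (a hypothetical representation), and "arcs contained in $S$" is read as the number of out-points of vertices of $S$.

```python
def explore(out_arcs, v0):
    """Run the exploration from v0; return the reached set S and the number of points investigated."""
    S = {v0}
    T = [(v0, j) for j in range(len(out_arcs[v0]))]  # uninvestigated out-points
    investigated = 0
    while T:
        u, j = T.pop()          # pick a point of T and delete it
        investigated += 1
        v = out_arcs[u][j]      # the vertex whose in-point it is matched to
        if v not in S:
            S.add(v)
            T.extend((v, j2) for j2 in range(len(out_arcs[v])))
    return S, investigated


def terminates_properly(out_arcs, v0):
    """True if S is a complex sink-set whose vertices carry at most half of all arcs."""
    n = len(out_arcs)
    m = sum(len(heads) for heads in out_arcs)
    S, k = explore(out_arcs, v0)
    is_sink_set = len(S) < n    # S is closed under out-arcs by construction
    is_complex = any(len(out_arcs[u]) > 1 for u in S)
    return is_sink_set and is_complex and 2 * k <= m
```

For instance, with `out_arcs = [[1, 0], [0], [3, 0], [2, 4], [5, 1], [2, 3]]` the exploration from vertex 0 stops with $S = \{0, 1\}$ after investigating 3 of the 11 points, so it terminates properly, whereas from vertex 2 every vertex is reached.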

If the algorithm terminates with $S$ being a complex sink-set containing at most half of the
arcs of $M$, we say that it terminates properly, and otherwise improperly. We complete the
proof by showing that the probability that there exists a vertex $v_0$ such that
the algorithm terminates properly, when begun from $v_0$, is $o(1)$.

As is common in analysing algorithms like this, we will make use of the fact that,
conditioning on any set of values of a uniformly random permutation, the remaining values are
still uniformly random. Thus, the algorithm can be performed simultaneously with the
generation of the random bijection $\phi$ and permutation $\sigma$. At the start, $\phi$ and $\sigma$ are entirely
undetermined and we can choose $\phi(a_i)$ at random from the unused points of $B$ at each step of
the algorithm. Similarly, we may choose $\sigma(v_0)$ initially, and then $\sigma(v)$ at each step where the
vertex $v$ was not already in $S$, randomly from the indices $i$ of the unused bins $A_i$. Thus, we
may initially choose u.a.r. a permutation $\phi_1, \dots, \phi_m$ of $B$, and independently a permutation
$\sigma_1, \dots, \sigma_n$ of $[n]$ u.a.r., and use $\phi_1$ for the first value of $\phi$ called for in the algorithm, $\phi_2$ for the
second, and so on, and similarly for $\sigma$. Set $K_k = \{\phi_1, \dots, \phi_k\}$ and $J_s = \{\sigma_1, \dots, \sigma_s\}$. Since
the $\phi_i$ and $\sigma_i$ are pre-chosen randomly, it follows that

for given $k$ and $s$, $J_s \subseteq [n]$ and $K_k \subseteq B$ are independent and u.a.r. (21)

In particular, the joint distribution of $J_s$ and $K_k$ does not depend on the algorithm, which
is the important feature that simplifies analysis.

Now define

$$\hat{k} = \sum_{i \in J_s} |A_i| \qquad (22)$$

and let $U_{v_0}$ denote the event that the algorithm terminates properly, with $k$ and $s$ defined as
above, in particular with $S$ being a complex sink-set with at most $m/2$ arcs. In the event $U_{v_0}$,
since the termination condition implies that $T$ is empty, it follows that

$$\hat{k} = k. \qquad (23)$$

Also define

$$\hat{s} = |\{u : u = v_0 \text{ or } K_k \cap B_u \ne \emptyset\}|. \qquad (24)$$

Note that at each step, since $\beta(\phi(a_i))$ is added to $S$, we have $S = \{v_0\} \cup \{u : K_k \cap B_u \ne \emptyset\}$
and hence

$$\hat{s} = s. \qquad (25)$$

Moreover, the fact that $S$ is complex is equivalent to the condition that there are more arcs
chosen than vertices in $S$, and so $k > s$. Hence, an upper bound on $P(U_{v_0})$ is the probability
that (23) and (25) hold for some $k$ and $s$ with $k \le m/2$ and $s < k$.



Denote the event that (|25|) holds, given k and s, with s generated according to (|24j) 
given (f2Tj) . by H^ s , and similarly the event that ([23]) holds, given k and s, with k generated 
according to ([22]) . by H^ g . Also put Hk )S = H^ s /\H ks . We will prove that

P( U U F ^)=«(0- (26) 

^k<m/2 s<k ' 

Then P-p' ( nm ) (L^ ) = o(rT v ), and the result follows by taking the union bound Pp' ( n ,m) (U) < 
E« P 7»{ il (n,m)(f«o) and from ©• 

It only remains to show (|26p . for which we split k into two intervals. 

Case 1. $\log^4 n \le k \le m/2$.

We first bound probabilities in the distribution of $\hat s$ as determined by (24). Recall the truncated
Poisson distribution as defined in Section 2. Let $\Omega = \Omega(n, c, q)$ denote the probability space in
which there are $n$ bins $B_i$ with $|B_i| = d_i^-$, where $d_1^-, \ldots, d_n^-$ are independent random variables
each with the distribution of $\mathrm{TPo}(\lambda)$, and such that a random subset $T$ of the points in the
bins is chosen by including each point independently with probability

$$q = k/m.$$

Let $\hat s$ be the number of the bins that are either occupied by at least one point of $T$ or happen
to be the bin $v_0$. It follows from Lemma 2.5 that

$$P_\Omega\big(|\hat s - nq'| > \sqrt{nq'}\,\log n\big) = o(n^{-3}), \tag{27}$$

where $q'$ is the probability that a bin contains some point of $T$. Note that $q' > q(1+\varepsilon)$ for some
positive constant $\varepsilon$ that can be determined from (12). Now define $E$ to be the event in $\Omega$ that
the total content of bins is $\sum_{i=1}^n d_i^- = m$ and that exactly $k$ points are chosen in $T$. Observe
that, in the probability space conditional upon $E$, the number $\hat s$ of bins containing at least
one element of $T$ is distributed as in the definition of $H^{\hat s}_{k,s}$. From Lemma 2.2, $\sum_{i=1}^n d_i^- = m$
holds with probability $\Theta(n^{-1/2})$, and, conditional on that, the event $|T| = k$ has probability
$\Theta(k^{-1/2})$, since $|T| \sim \mathrm{Bin}(m, k/m)$. Hence, $P_\Omega(E) = \Omega(n^{-1})$ and by (27)

$$P\Big(\bigcup_{|s - nq'| > \sqrt{nq'}\,\log n} H^{\hat s}_{k,s}\Big) = P_\Omega\big(|\hat s - nq'| > \sqrt{nq'}\,\log n \,\big|\, E\big) = o(n^{-2}). \tag{28}$$

The next (and simpler) step is to define $\Omega' = \Omega'(n, c, s)$ to be the probability space in which
there are $n$ bins $A_i$ with $|A_i| = d_i^+$, all independent random variables each with the distribution
of $\mathrm{TPo}(\lambda)$, and such that a uniformly random set of $s$ of the bins is chosen. Assume that $s$
lies in the range $|s - nq'| \le \sqrt{nq'}\,\log n$, and in particular $s = \Omega(\log^4 n)$. Let $\hat k$ be the total
number of points in the selected bins. From Lemma 2.3 we obtain the tail bound

k < ^-j^j < e- ^ 1/3 ) = o(n-% (29) 

Now let $E'$ be the event in $\Omega'$ that $\sum_{i=1}^n d_i^+ = m$, and recall from Lemma 2.2 that
$P_{\Omega'}(E') = \Theta(n^{-1/2})$. Observe that in $\Omega'$ conditional upon $E'$ the distribution of $\hat k$ is the same as the
one in the definition of $H^{\hat k}_{k,s}$. Moreover, the fact that $|s - nq'| \le \sqrt{nq'}\,\log n$ implies that
$k \le cs/(1 + \varepsilon + o(1))$. In view of all that and from (29), we obtain that for $|s - nq'| \le \sqrt{nq'}\,\log n$

$$P\big(H^{\hat k}_{k,s}\big) \le P_{\Omega'}\Big(\hat k \le \frac{cs}{1 + \varepsilon + o(1)} \,\Big|\, E'\Big) = o(n^{-4}). \tag{30}$$

Taking the union bound over all $s$ such that $|s - nq'| \le \sqrt{nq'}\,\log n$, combining it with (28) and
summing over all $k$ between $\log^4 n$ and $m/2$ completes the proof of (26) for this range of $k$.
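The quantity $q'$ can be made concrete under the model just described. The following is a minimal sketch, assuming that $\mathrm{TPo}(\lambda)$ is a Poisson($\lambda$) variable conditioned to be at least 1 (the Section 2 definition is not reproduced here) and that each point is kept independently with probability $q$. Under these assumptions $q' = 1 - E[(1-q)^d] = (e^\lambda - e^{\lambda(1-q)})/(e^\lambda - 1)$. The function names are illustrative and do not come from the paper. The code compares the closed form with a simulation; the bound $q'/q \ge 1 + \tanh(\lambda/4)$ for $q \le 1/2$ is one possible explicit choice of $\varepsilon$, not necessarily the one derived from (12).

```python
import math
import random


def q_prime(lam, q):
    """Probability that a bin of TPo(lam) points keeps at least one point,
    when each point is kept independently with probability q (assumed model)."""
    return (math.exp(lam) - math.exp(lam * (1 - q))) / math.expm1(lam)


def sample_tpo(lam, rng):
    """Sample Poisson(lam) conditioned on being >= 1, by rejection."""
    limit = math.exp(-lam)
    while True:
        # Knuth's multiplication method for one Poisson(lam) sample.
        prod, x = rng.random(), 0
        while prod > limit:
            prod *= rng.random()
            x += 1
        if x >= 1:
            return x


def estimate_q_prime(lam, q, bins, rng):
    """Monte Carlo estimate of q' over the given number of independent bins."""
    hit = 0
    for _ in range(bins):
        d = sample_tpo(lam, rng)
        if any(rng.random() < q for _ in range(d)):
            hit += 1
    return hit / bins
```

Since $q'$ is concave in $q$ with $q'(0) = 0$, the ratio $q'/q$ is smallest at the right end of the range $q \le 1/2$, which is where the $\tanh(\lambda/4)$ value comes from.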

Case 2. $k < \log^4 n$.

For $d \in \mathcal D$, the event $d^-_{\max} < \log^2 n$, or equivalently $|B_i| < \log^2 n$ for each $i$, holds with
probability $1 - o(n^{-1})$. This follows readily from bounding the probability of the complement
in $\mathcal D$ and then conditioning upon $\sum_i d_i^- = m$ (see (jl|) and Lemma 2.2). Since $\mathcal P'_{1,1}(n, m)$ is just
$\mathcal P'(d)$ with $d$ distributed as in $\mathcal D$, we may focus on $\mathcal P'(d)$ for a particular $d$ satisfying the above
property. Note that $\hat s = s < k$ according to (24) if the random choice $\{\phi_1, \ldots, \phi_k\}$ of elements
of $B$ determines at most $k - 2$ bins other than $v_0$. This has probability $O(k^4 (\log^2 n/m)^2)$.
Hence, in $\mathcal P'(d)$

$$\sum_{k < \log^4 n} P\Big(\bigcup_{s<k} H_{k,s}\Big) = \sum_{k < \log^4 n} P(\hat s \le k-1) = O(\log^{24} n/n^2) = o(1/n). \qquad \Box$$

Next consider plain s-sets of $\mathcal G_{1,1}(n,m)$.

Proposition 4.2. Suppose that $c = m/n$ is bounded and bounded away from 1. The probability
that a digraph in $\mathcal G_{1,1}(n,m)$ has no plain s-set is asymptotic to

$$\frac{e^\lambda (e^\lambda - 1 - \lambda)^2}{(e^{2\lambda} - e^\lambda - \lambda)(e^\lambda - 1)}, \tag{31}$$

with $\lambda$ determined by the equation $c = \lambda e^\lambda/(e^\lambda - 1)$.
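For a given density $c$ the limiting probability can be evaluated numerically. The sketch below assumes that (31) reads $e^\lambda(e^\lambda-1-\lambda)^2 / \big((e^{2\lambda}-e^\lambda-\lambda)(e^\lambda-1)\big)$, as reconstructed from the damaged display, and solves $c = \lambda e^\lambda/(e^\lambda-1)$ by bisection. The function names are illustrative and not taken from the paper.

```python
import math


def mean_tpo(lam):
    """lam * e^lam / (e^lam - 1): the right-hand side of the equation defining lambda."""
    return lam * math.exp(lam) / math.expm1(lam)


def solve_lambda(c, tol=1e-13):
    """Solve c = lam*e^lam/(e^lam - 1) for lam > 0 by bisection (requires c > 1)."""
    if c <= 1:
        raise ValueError("c must exceed 1")
    lo, hi = 1e-12, max(2.0, 2.0 * c)  # mean_tpo is increasing and mean_tpo(hi) > c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_tpo(mid) < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def prob_no_plain_s_set(c):
    """Evaluate the assumed form of (31) at the lambda determined by c."""
    lam = solve_lambda(c)
    e = math.exp(lam)
    return e * (e - 1 - lam) ** 2 / ((e * e - e - lam) * (e - 1))
```

Under this reading the value tends to 0 as $c \to 1$ and to 1 as $c \to \infty$, which is consistent with the proposition requiring $c$ to be bounded and bounded away from 1.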

Proof. The simplest sink-sets or source-sets are those whose vertices induce a directed cycle.
Call them sink-cycles or source-cycles accordingly. An s-cycle is just a set of vertices which is
either a sink-cycle or a source-cycle. Observe that each plain s-set must contain one s-cycle,
so we can restrict our attention to s-cycles.

For any constant natural $k > 1$, let $C_k$ be the number of s-cycles of order at most $k$. Let
$D$ be the number of double arcs. Define

$$\mu_k = \sum_{j=1}^{k} \frac{2(c/e^{\lambda})^j - (c/e^{2\lambda})^j}{j}.$$

Easy computations show that $2(c/e^\lambda)^j > (c/e^{2\lambda})^j$, so that there are no cancellations in any
term of the definition of $\mu_k$. We first claim that $E_{\mathcal P_{1,1}(n,m)} C_k \sim \mu_k$, $E_{\mathcal P_{1,1}(n,m)} D \sim \lambda^2/2$, and
moreover $C_k$ and $D$ are asymptotically jointly independent Poisson. Elementary calculations
show that

$$\mu = \lim_{k\to\infty} \mu_k = \log\left(\frac{(e^{2\lambda} - e^\lambda - \lambda)(e^\lambda - 1)}{e^\lambda (e^\lambda - 1 - \lambda)^2}\right).$$
On the other hand, we claim that the probability there is an s-cycle of order greater than $k$
can be bounded by some function $f_k$ such that $\lim_{k\to\infty} f_k = 0$. In view of all this, setting $\mathcal S$
to be the event that $\mathcal P_{1,1}(n,m)$ has no multiple arcs and $\mathcal Y$ to be the event in $\mathcal G_{1,1}(n,m)$ or
$\mathcal P_{1,1}(n, m)$ that there are no s-cycles, we get

$$P_{\mathcal P_{1,1}(n,m)}(\mathcal Y \cap \mathcal S) \sim e^{-\mu - \lambda^2/2}.$$

Then the proof of the result follows immediately from Lemma 3.2(b) and the fact that

$$P_{\mathcal G_{1,1}(n,m)}(\mathcal Y) = P_{\mathcal P_{1,1}(n,m)}(\mathcal Y \mid \mathcal S).$$
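The limit of the partial sums can be checked numerically. This sketch assumes the reading $\mu_k = \sum_{j=1}^k \big(2(c/e^\lambda)^j - (c/e^{2\lambda})^j\big)/j$ and the closed form $\mu = \log\big((e^{2\lambda}-e^\lambda-\lambda)(e^\lambda-1)/(e^\lambda(e^\lambda-1-\lambda)^2)\big)$, both reconstructed from damaged displays. The function names are illustrative only.

```python
import math


def c_of_lambda(lam):
    """Density c associated with lambda through c = lam*e^lam/(e^lam - 1)."""
    return lam * math.exp(lam) / math.expm1(lam)


def mu_k(c, lam, k):
    """Partial sum of the assumed series for the mean number of short s-cycles."""
    a = c / math.exp(lam)
    b = c / math.exp(2 * lam)
    return sum((2 * a**j - b**j) / j for j in range(1, k + 1))


def mu_limit(lam):
    """Assumed closed form of the limit of mu_k as k grows."""
    e = math.exp(lam)
    return math.log((e * e - e - lam) * (e - 1) / (e * (e - 1 - lam) ** 2))
```

The agreement follows from $\sum_{j \ge 1} x^j/j = -\log(1-x)$ applied to $x = c/e^\lambda$ and $x = c/e^{2\lambda}$, and $e^{-\mu}$ then coincides with the expression in (31).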

Now we proceed to verify the claims we made about $C_k$, $D$ and the expected number of
"long" s-cycles. To make the computations easier, we generate the elements of $\mathcal P_{1,1}(n, m)$ using
a slight variation of the $\mathcal P'_{1,1}(n, m)$ model in which the in-points $a_1, \ldots, a_m$ (resp. out-points
$b_1, \ldots, b_m$) are assigned independently and u.a.r. to the in-bins $A_1, \ldots, A_n$ (resp. out-bins
$B_1, \ldots, B_n$) conditional upon each bin receiving at least one point (note that the degree
sequence thus obtained is distributed as in $\mathcal D$). In addition to that, a random bijection $\phi$ of
the out- and in-points, and a random permutation $\sigma$ of the labels of the out-bins are chosen as
before independently and u.a.r. (alternatively we may consider $\sigma$ to be a random bijection of
the out- and in-bins).

First we wish to compute the joint factorial moments of $C_k$ and $D$. We shall index all
possible s-cycles of length at most $k$ by their position (i.e. the vertices they use in cyclic order).
More precisely, the position of a cycle of length $\ell$ is determined by a tuple of $\ell$ distinct in-bins
$B_{i_1}, \ldots, B_{i_\ell}$ given in cyclic order together with an ordered tuple of out-bins $A_{i_1}, \ldots, A_{i_\ell}$. A
random element of $\mathcal P'_{1,1}(n,m)$ has an s-cycle at $c$ if it has an s-cycle on vertices $v_1, \ldots, v_\ell$
where each vertex $v_j$ corresponds to the bins $A_{\sigma(i_j)}$ and $B_{i_j}$.

Fix $c_1, \ldots, c_r$, where each $c_i$ is the position of a cycle of length $\ell_i$ and the bins used for
each position are pairwise disjoint. Let $X_{c_1,\ldots,c_r}$ be the indicator function for the event that
there is an s-cycle at each position $c_i$. We compute the probability that this event holds. The
probability that the out-bins are assigned to the corresponding in-bins is

$$1/[n]_{\ell_1+\cdots+\ell_r} \sim 1/n^{\ell_1+\cdots+\ell_r}. \tag{32}$$

Condition on this, and note that the degrees of the bins and the matching of the points occur
independently from that. Now we claim that the probability that the right s-cycles occur at
$c_1, \ldots, c_r$ is asymptotic to

$$\prod_i \left(\frac{2a^{\ell_i}}{n^{\ell_i}} - \frac{a^{2\ell_i}}{m^{\ell_i}}\right), \tag{33}$$

where $a = \lambda/(e^\lambda - 1) = c/e^\lambda$. Observe that the events of having a sink-cycle or having a
source-cycle at $c_i$ are not disjoint, so the probability of the union is the sum of probabilities minus the
probability of having both a sink- and a source-cycle at $c_i$. Thus, in order to estimate (33), we
can specify for each $c_i$ one of the three former events (sink-cycle, source-cycle or both). More
precisely, given $r_1+r_2+r_3 = r$ and relabelling $c_1, \ldots, c_r$ to $c_1, \ldots, c_{r_1}, c'_1, \ldots, c'_{r_2}, c''_1, \ldots, c''_{r_3}$ and
$\ell_1, \ldots, \ell_r$ to $\ell_1, \ldots, \ell_{r_1}, \ell'_1, \ldots, \ell'_{r_2}, \ell''_1, \ldots, \ell''_{r_3}$, we shall compute w.l.o.g. the following
probability: having a sink-cycle at $c_1, \ldots, c_{r_1}$ (and possibly a source-cycle too); having a source-cycle
at $c'_1, \ldots, c'_{r_2}$ (and possibly a sink-cycle too); and having both a sink- and a source-cycle at
$c''_1, \ldots, c''_{r_3}$. We shall see that this probability is asymptotic to

$$\prod_{i=1}^{r_1} \frac{a^{\ell_i}}{n^{\ell_i}}\ \prod_{i=1}^{r_2} \frac{a^{\ell'_i}}{n^{\ell'_i}}\ \prod_{i=1}^{r_3} \frac{a^{2\ell''_i}}{m^{\ell''_i}}, \tag{34}$$

and this leads to (33) by easy inclusion-exclusion.

To arrive at (34), we require that $\ell_1 + \cdots + \ell_{r_1} + \ell''_1 + \cdots + \ell''_{r_3}$ specific out-bins determined
by $c_1, \ldots, c_{r_1}, c''_1, \ldots, c''_{r_3}$ contain exactly one point each. By symmetry the probability of
this is $E[N]_{\ell_1+\cdots+\ell_{r_1}+\ell''_1+\cdots+\ell''_{r_3}} / [n]_{\ell_1+\cdots+\ell_{r_1}+\ell''_1+\cdots+\ell''_{r_3}}$, where $N$ is the number of out-bins with
only one point. $N$ is concentrated around $an$ by [3, Lemma 1] since the distribution of balls
in bins is truncated multinomial. Hence $E[N]_{\ell_1+\cdots+\ell_{r_1}+\ell''_1+\cdots+\ell''_{r_3}} \sim (an)^{\ell_1+\cdots+\ell_{r_1}+\ell''_1+\cdots+\ell''_{r_3}}$
and the probability is $(1 + o(1))\,a^{\ell_1+\cdots+\ell_{r_1}+\ell''_1+\cdots+\ell''_{r_3}}$. Analogously, we need that some specific
$\ell'_1 + \cdots + \ell'_{r_2} + \ell''_1 + \cdots + \ell''_{r_3}$ in-bins contain only one point, which is independent from the
previous and has probability $(1 + o(1))\,a^{\ell'_1+\cdots+\ell'_{r_2}+\ell''_1+\cdots+\ell''_{r_3}}$. Conditional upon all this, we
need that for each $c_1, \ldots, c_{r_1}$, the only point in each out-bin is matched to some point in the
corresponding in-bin; for each $c'_1, \ldots, c'_{r_2}$, the only point in each in-bin is matched to some
point in the corresponding out-bin; and for each $c''_1, \ldots, c''_{r_3}$, the only point in each out-bin is
matched to the only point in the corresponding in-bin. Observe that the number of points in
these out-bins that have not been exposed remains independent truncated Poisson conditional
to fixed sum $m - (\ell_1 + \cdots + \ell_{r_1} + \ell''_1 + \cdots + \ell''_{r_3})$. An analogous thing happens for in-bins
that were not exposed and the sum of their degrees is $m - (\ell'_1 + \cdots + \ell'_{r_2} + \ell''_1 + \cdots + \ell''_{r_3})$.
The probability of matching the in- and out-points appropriately for s-cycles at $c''_1, \ldots, c''_{r_3}$
is $1/[m]_{\ell''_1+\cdots+\ell''_{r_3}} \sim 1/m^{\ell''_1+\cdots+\ell''_{r_3}}$. We condition on that and on the event that no out-point
corresponding to $c_1, \ldots, c_{r_1}$ is matched to any in-point corresponding to $c'_1, \ldots, c'_{r_2}$ (which
happens with probability $1 + o(1)$). This makes the construction of the remaining sink-cycles
independent from that of source-cycles. For the sink-cycles, we have to match the only point
in each out-bin with some point in the corresponding in-bin. By symmetry, the first matching
has probability $1/\big(n - (\ell'_1 + \cdots + \ell'_{r_2} + \ell''_1 + \cdots + \ell''_{r_3})\big) \sim 1/n$. Conditional to some matchings
being exposed, the probability that the next out-point is matched to a point in an in-bin which
contains one matched point already is $O(1/n)$ since there is negative correlation between the
events that two out-points are matched to in-points in the same bin (condition on any given
degree). Thus that out-point is matched to some point in an unexposed in-bin with probability
$1 + o(1)$ and conditional to that, again by symmetry, chooses the right in-bin with probability
$(1 + o(1))/n$. This gives a probability $(1 + o(1))/n^{\ell_1+\cdots+\ell_{r_1}}$ for having the matchings required
for the sink-cycles. Analogously, the source-cycles give a $(1 + o(1))/n^{\ell'_1+\cdots+\ell'_{r_2}}$ factor, and this
establishes the estimate (34).

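So far we were dealing with fixed cycle tuples $c_1, \ldots, c_r$. Let $C_k$ be the random number
of s-cycles occurring. To compute the $r$-th factorial moment it suffices to multiply (32) and
(33) by the number of ways of choosing $r$ different $c_1, \ldots, c_r$, which is

$$\sum_{\ell_1,\ldots,\ell_r \in \{1,\ldots,k\}} \frac{\big([n]_{\ell_1+\cdots+\ell_r}\big)^2}{\ell_1 \cdots \ell_r}.$$

Hence,

$$E[C_k]_r \sim \left(\sum_{\ell=1}^{k} \frac{2a^{\ell} - (a^2/c)^{\ell}}{\ell}\right)^{r}.$$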

Let $D$ be the number of double arcs occurring. Recall that the out-points are placed in the
out-bins (and the in-points in the in-bins) uniformly at random and independently conditional
upon getting at least one point in each bin. We index double arcs according to their position,
where each position $j$ is a set of two different out-points in the same out-bin along with a set
of two different in-points in the same in-bin. Let $Z$ be the number of positions for double arcs.
We have the trivial bound $Z \le m^4$. Combining together Lemmas 2.1, 2.4 and 2.2 we have
that \Z - (Xcn/2) 2 \ < n ls with probability 1 - 0(n 1 / 2 e _u « Sn ). Hence, B[Z] S ~ (Acn/2) 2s . 
We say that there is a double arc at $j$ if the out-points are matched to the in-points in any of
the two possible ways. Fix $s$ different positions $j_1, \ldots, j_s$. The probability of having double
arcs at $j_1, \ldots, j_s$ is $2^s/[m]_{2s}$. Therefore,

$$E[D]_s = E[Z]_s\, 2^s/[m]_{2s} \sim (\lambda^2/2)^s.$$

In order to compute joint moments of $C_k$ and $D$, we condition on s-cycles happening at some
fixed positions $c_1, \ldots, c_r$ and we specify the type of each cycle (sink-, source- or both) in
the same fashion we used in the previous computations (sink does not exclude source and
vice versa). To make computations easier we also condition on the particular in-points and
out-points matched to create the s-cycles. Conditional upon all this, we compute $E[D]_s$. The
same computations we did before are still valid if applied to the in- and out-points that were
not used in the construction of the s-cycles, and this yields the same asymptotic value
$(\lambda^2/2)^s$. So

$$E\big([C_k]_r [D]_s\big) \sim \left(\sum_{\ell=1}^{k} \frac{2a^{\ell} - (a^2/c)^{\ell}}{\ell}\right)^{r} (\lambda^2/2)^s.$$

The claim that the distributions are asymptotically independent Poisson now follows by the
standard method of moments.

It only remains to bound the probability of existence of s-cycles of length greater than $k$
by some function $f_k$ such that $\lim_{k\to\infty} f_k = 0$. It is enough to deal with sink-cycles, since the
result for source-cycles follows by considering the converse digraph. Take a length $\ell > k$. We
now condition on the number $N$ as defined above. We can choose $([n]_\ell)^2/\ell$ different positions
for such a cycle. For each of these, the probability that the bins are matched the right way is
$1/[n]_\ell$ (regardless of $N$). The probability that each of the $\ell$ out-bins contains exactly one point
is at most $(N/n)^\ell$. The probability that each of the $\ell$ out-points is matched to a point in the
corresponding in-bin is at most $1/[n]_\ell$ (since conditional upon $i$ matched pairs, the probability
of the next matched pair is $1/(n - i)$ times the probability of hitting a point in an in-bin not
previously hit). This is again regardless of $N$.

Putting this together, this expectation is at most

$$\sum_{\ell > k} \frac{(N/n)^{\ell}}{\ell}.$$

This tends to 0 for large $k$ provided $N/n < (a + 1)/2$. The probability that $N$ is larger than
this is $o(1)$ by the concentration mentioned above. □

From Theorem 1.2, Proposition 4.1 and Proposition 4.2 we immediately obtain Theorem 1.1
for the case $c$ is bounded and bounded away from 1.
5 Very sparse case: $c \to 1$

Here we prove Theorem 1.1 for the case $c \to 1$. Thus $r = m - n = o(n)$, and we assume
$r = m - n \to \infty$. We define the directed graph analogue of the kernel of a graph as follows.
A cycle component of a digraph is a connected component which is simply a directed cycle.
A digraph with each in- and outdegree at least 1 and with no cycle components is called a
preheart. The heart of a preheart $G$ is the multidigraph $H(G)$ obtained from $G$ by repeatedly
choosing a vertex $v$ of in- and outdegrees both 1, deleting $v$ and its two incident arcs $uv$ and
$vw$, and inserting the arc $uw$. The condition that $G$ contains no isolated cycle ensures that
the heart is always a multidigraph. The vertices of $H(G)$ are just the vertices of $G$ of total
degree at least 3.
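The reduction from a preheart to its heart is easy to carry out on an explicit arc list. The following is a minimal sketch, assuming the digraph is given as a list of `(tail, head)` pairs and is a valid preheart (every in- and outdegree at least 1, no cycle components). The function name and the representation are choices made for this example, not notation from the paper.

```python
from collections import defaultdict


def heart(arcs):
    """Return the arc list of the heart of a preheart given as (tail, head) pairs.

    Repeatedly picks a vertex v with in- and outdegree both 1, deletes v and its
    arcs uv and vw, and inserts the arc uw. The result may contain loops and
    multiple arcs.
    """
    arcs = list(arcs)
    while True:
        indeg, outdeg = defaultdict(int), defaultdict(int)
        for u, v in arcs:
            outdeg[u] += 1
            indeg[v] += 1
        target = next((x for x in indeg if indeg[x] == 1 and outdeg[x] == 1), None)
        if target is None:
            return arcs
        u = next(a for a in arcs if a[1] == target)[0]  # tail of the unique in-arc
        w = next(a for a in arcs if a[0] == target)[1]  # head of the unique out-arc
        if u == target:
            # A lone loop means the input had a cycle component, so it is not a preheart.
            raise ValueError("input contains a cycle component")
        arcs = [a for a in arcs if target not in a] + [(u, w)]
```

Each suppression removes one vertex and one arc, so the difference between the number of arcs and the number of vertices is the same for the preheart and its heart. This simple version rescans all arcs on each step and is meant only for small examples.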

Note that a digraph is strongly connected iff it is an isolated cycle or a preheart with
strongly connected heart. Thus we may use ideas similar to those in Section 4 to study the
heart, as a key step to enumerate strongly connected digraphs. Connectivity properties of the
heart can be easier to prove than for just the (1,1)-dicore. In the dicore, some complex s-sets
can involve many vertices of in- and outdegree 1 and just a few other vertices. We will focus
on the heart and also use randomisation of the in- and outdegree sequences, as in the $\mathcal P(d)$
and $\mathcal P'(d)$ models in Section 4.

Consider any given degree sequence $d \in \mathcal D$, and let $T = T(d) = \{i : d_i^+ + d_i^- \ge 3\}$. We put
$n' = |T|$ and $m' = \sum_{i \in T} d_i^+ = \sum_{i \in T} d_i^-$, and note that $m - n = m' - n'$. For simplicity of
presentation, renumber the vertices if necessary so that $T = [n']$.

Let $\mathcal H(d)$ be the probability space of heart configurations generated as follows. For each
$i \in T$ consider a bin containing labelled points of two types, namely $d_i^+$ out-points
and $d_i^-$ in-points, and then choose a random matching of the in-points with the out-points
(there are $m'$ of each kind). Note that each heart configuration in $\mathcal H(d)$ corresponds to a
multidigraph on vertex set $T$ obtained in a natural way by identifying bins with vertices and
adding an arc $(u, v)$ for each out-point in $u$ matched to an in-point in $v$.

Moreover, given a heart configuration $H$, we construct a preheart configuration $Q$ by
taking an assignment of $[n] \setminus T$ to the arcs of $H$ (i.e. the pairs of matched up points), such
that the numbers assigned to each arc are given a linear ordering. Denote this assignment,
including the linear orderings, by $f$. Let $\mathcal Q(d)$ be the probability space of random preheart
configurations created by taking $H \in \mathcal H(d)$ and choosing $f$ u.a.r. Note that each $Q \in \mathcal Q(d)$
corresponds to a multidigraph with $n$ vertices, $m$ arcs and degree sequence $d$. Henceforth,
any graph terminology referring to a heart or preheart configuration should be interpreted in
terms of the corresponding multidigraph.

Lemma 5.1. The digraphs generated from the restriction of $\mathcal Q(d)$ to simple preheart
configurations (i.e. with no multiple arcs) are uniformly distributed.

Proof. Each simple digraph comes from $\prod_{i=1}^{n} d_i^+!\, d_i^-!$ different preheart configurations. □

As will become apparent later in the argument, it turns out that the degree sequence
distribution induced by the uniform probability space of all prehearts on $n$ vertices and $m$
arcs is close in some sense to $\mathcal D$. This motivates considering the probability spaces $\mathcal H(n,m)$
and $\mathcal Q(n,m)$, defined by choosing a random element from $\mathcal H(d)$ and $\mathcal Q(d)$ respectively, where
the degree sequence $d$ is also random and distributed as in $\mathcal D$.

Given a degree sequence $d \in \mathcal D$, we distinguish four kinds of vertices depending on whether
their in- and outdegree are equal to 1, or larger. For $i,j \in \{1, 2\}$, let $N_{i,j}$ be the set of vertices
with indegree of type $i$ and outdegree of type $j$ (type 1 means 1 and type 2 means greater
than 1). Let $a = (a_{1,1}, a_{1,2}, a_{2,1}, a_{2,2})$, where $a_{i,j} = |N_{i,j}|$. This $a$ is of course a function of
$d$. Observe that any $a$ which is feasible (i.e. occurs in $\mathcal D$ with nonzero probability) satisfies
$a_{1,1} + a_{1,2} + a_{2,1} + a_{2,2} = n$, $1 \le a_{1,2} + a_{2,2} \le r$ and $1 \le a_{2,1} + a_{2,2} \le r$. Conversely, it is easy
to check that, for sufficiently large $n$, any nonnegative tuple $a$ satisfying the above conditions
is feasible. Note also that $n' = a_{1,2} + a_{2,1} + a_{2,2}$ and $m' = r + a_{1,2} + a_{2,1} + a_{2,2}$.
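The bookkeeping for $a$, $n'$ and $m'$ can be illustrated on a concrete degree sequence. The sketch below assumes the degree sequence is supplied as two equal-length lists of in- and outdegrees, all at least 1 and with equal sums. The function name and return format are illustrative only.

```python
def classify(d_in, d_out):
    """Return (a, r, n_heart, m_heart) for a degree sequence with all degrees >= 1.

    a[(i, j)] counts vertices whose indegree has type i and outdegree has type j,
    where type 1 means "equal to 1" and type 2 means "greater than 1".
    """
    if sum(d_in) != sum(d_out) or min(d_in + d_out) < 1:
        raise ValueError("need equal degree sums and all degrees at least 1")
    a = {(i, j): 0 for i in (1, 2) for j in (1, 2)}
    for din, dout in zip(d_in, d_out):
        a[(1 if din == 1 else 2, 1 if dout == 1 else 2)] += 1
    n, m = len(d_in), sum(d_in)
    r = m - n
    n_heart = a[(1, 2)] + a[(2, 1)] + a[(2, 2)]  # vertices of total degree >= 3
    m_heart = r + n_heart                        # uses m - n = m' - n'
    return a, r, n_heart, m_heart
```

For example, `classify([1, 2, 1, 1, 3], [1, 1, 2, 1, 3])` has two vertices of type (1,1), so $n' = 3$ and $m' = 6$ with $r = 3$.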

We will want to condition on "typical" values of $a$. Denote by $\Gamma$ the event that
$|a_{1,2} - r| \le \sqrt{r}\,\log r$, $|a_{2,1} - r| \le \sqrt{r}\,\log r$ and $a_{2,2} \le \max\{2r^2/n, \sqrt{r}\}$.
Note in particular that $\Gamma$ implies

$$n' \sim 2r, \quad m' \sim 3n'/2 \sim 3r, \quad a_{1,2} \sim a_{2,1} \sim r, \quad a_{2,2} = o(r). \tag{35}$$

We next show something somewhat stronger than $P_{\mathcal D}(\Gamma) = 1 - o(1)$.

Lemma 5.2.

$$E_{\mathcal D}\big(m'(1 - 1_\Gamma)\big) = o(1).$$

Proof. First we observe that $m'$ is deterministically at most $3r$ in $\mathcal D$. This upper bound is
immediate from the fact that the underlying undirected graph of the heart has $n'$ vertices,
$m' = n' + r$ edges and average degree $2m'/n' \ge 3$. Hence, by Lemma 3.2(a), it suffices to
bound the probability that $\Gamma$ fails by $o(1/r^2)$ in $\mathcal D$. Here, $a_{1,2}$, $a_{2,1}$ and $a_{2,2}$ are binomially
distributed with expectations $r$, $r$ and $r^2/n$ respectively. (Note that $r \to \infty$, but $r^2/n$ need
not be large.) Hence, standard bounds (if $r$ grows very slowly, does not suffice, but in
any case we can simply consider ratios of consecutive binomial probabilities) show that the
conditions on $a_{1,2}$ and $a_{2,1}$ in the definition of $\Gamma$ hold with probability $1 - o(1/r^2)$. A similar
argument ensures that $a_{2,2}$ has the required concentration with probability $1 - o(1/r^2)$, but
the analysis is split into two cases. If $r < n^{3/5}$, then $r^2/n < r^{1/3}$ and we easily bound the
probability that $a_{2,2} > \sqrt{r}$, for instance by comparing with a binomial with mean $r^{1/3}$. On
the other hand, if $r \ge n^{3/5}$, we bound the probability that $a_{2,2} > 2r^2/n$ using (9). □

This result allows us to condition on feasible $a$ satisfying $\Gamma$. In fact, for any given feasible
tuple $a$, we denote by $\mathcal H(a)$ and $\mathcal Q(a)$ respectively the probability spaces $\mathcal H(n, m)$ and $\mathcal Q(n, m)$
conditional on having that particular $a$.

Lemma 5.3. Let $a$ be any feasible tuple satisfying $\Gamma$. Then a random heart configuration in
$\mathcal H(a)$ a.a.s. has no complex s-set of at most $m'/2$ arcs.

Proof. The argument shares many features with the proof of Proposition 4.1, in particular
using auxiliary randomisation to simplify computations. Let $N'$ denote $[n']$ (which was also
$T$, the relevant set of vertices for the heart configuration). For each $d \in \mathcal D$ consider, as in the
definition of $\mathcal P'(d)$, two sets of points $A = \{a_1, \ldots, a_{m'}\}$ and $B = \{b_1, \ldots, b_{m'}\}$, partitioned
respectively into bins $A_1, \ldots, A_{n'}$ and bins $B_1, \ldots, B_{n'}$, with $|A_i| = d_i^+$ and $|B_i| = d_i^-$ for each
$i \in N'$. We write $\alpha(a_i) = j$ if $a_i \in A_j$, and $\beta(b_i) = j$ if $b_i \in B_j$. Define the probability
space $\mathcal H'(d)$ to be a random bijection $\phi : A \to B$ chosen u.a.r. together with two random
permutations $\sigma$ and $\tau$ of $[n']$, chosen independently of $\phi$ and of each other and u.a.r. subject
to the conditions that $d^+_{\sigma(i)} = 1$ whenever $d_i^+ = 1$, and $d^-_{\tau(i)} = 1$ whenever $d_i^- = 1$. We need an
appropriate randomisation of the degrees. Thus, consider the probability space $\mathcal H'(a)$, whose
elements are selected at random from $\mathcal H'(d)$ with $d$ a random member of $\mathcal D$ but conditional
on the particular value of the vector $a = a(d)$.

Observe that each element $H'$ in $\mathcal H'(a)$ corresponds in a natural way to an element $H$
in $\mathcal H(a)$, obtained by identifying the points in $A_{\sigma(j)}$ and those in $B_{\tau(j)}$ with the out-points
and in-points, respectively, of bin (vertex) $j$ (in the same way that elements in $\mathcal H'(d)$ can be
mapped to elements in $\mathcal H(d)$). Moreover, the $H$ obtained this way has the same distribution
as in $\mathcal H(a)$, since the distribution of the degree sequence and thus $a$ stay invariant after
permuting the indices of the vertices in $N'$ by $\sigma$ and $\tau$ (so it does not matter if we condition
to a particular $a$ before or after applying $\sigma$ and $\tau$). Hence, setting $U$ to be the event in $\mathcal H(a)$
or $\mathcal H'(a)$ that there is a complex sink-set containing at most $m'/2$ arcs, we have

$$P_{\mathcal H(a)}(U) = P_{\mathcal H'(a)}(U).$$

Henceforth we can do all calculations in $\mathcal H'(a)$, which simplifies the analysis as $\mathcal P'_{1,1}(n, m)$ did
in Section 4.

By the same argument as in the proof of Proposition 4.1, in order to bound the probability
of $U$, we can restrict our attention to complex sink-sets whose vertices are all reachable from
some vertex $v_0$. If the set of vertices reachable from vertex $v_0$ is a complex sink-set then
essentially the same algorithm as in Section 4 will terminate with $S$ being such a sink-set. We
restate the algorithm in the current setting as follows:

Start with $S = \{v_0\}$, $R = A_{\sigma(v_0)}$, and repeat the following until $R$ is empty. Pick $i \in R$, add
to $S$ the vertex $v$ such that $\phi(a_i) \in B_{\tau(v)}$ (if it is not already there), delete $i$ from $R$ and, if $v$
was not already in $S$, add all elements in $A_{\sigma(v)}$ to $R$.

As in the proof of Proposition 4.1, the algorithm can be performed simultaneously with
the generation of the random bijection $\phi$ and permutations $\sigma$ and $\tau$, piecemeal at each step
of the algorithm.
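The exploration can be sketched on an explicit multidigraph. This is a simplified illustration and not the paper's randomised procedure: it treats the arcs as already known instead of exposing $\phi$, $\sigma$ and $\tau$ step by step, and it keeps in `R` the heads of the unexposed arcs leaving `S`. The function name and the arc-list representation are assumptions made for the example.

```python
from collections import defaultdict


def explore(arcs, v0):
    """Explore out-arcs from v0; return (S, k).

    S is the set of vertices reachable from v0 (including v0) and k is the
    number of arcs exposed, which at termination is the total outdegree of S.
    """
    out = defaultdict(list)
    for u, v in arcs:
        out[u].append(v)
    S = {v0}
    R = list(out[v0])  # heads of unexposed arcs whose tails are already in S
    k = 0
    while R:
        v = R.pop()  # expose one arc
        k += 1
        if v not in S:
            S.add(v)
            R.extend(out[v])
    return S, k
```

At termination no arc leaves `S`, so `S` is a sink-set, and comparing `k` with `len(S)` gives the "more arcs than vertices" test used in the text for complex sets.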

We need some notation to describe the generation of $\sigma$ and $\tau$. Let

$$N_2^+ = N_{2,1} \cup N_{2,2} = \{i \in N' : d_i^+ > 1\}, \qquad N_2^- = N_{1,2} \cup N_{2,2} = \{i \in N' : d_i^- > 1\},$$

$$N_1^+ = N' \setminus N_2^+, \qquad N_1^- = N' \setminus N_2^-.$$

Also, at the start generate u.a.r. random permutations $\hat\sigma_j$ of $N_j^+$ and $\hat\tau_j$ of $N_j^-$ ($j = 1, 2$), and $\hat\phi$
of $B$, which we will view precisely as orderings of these sets (as in the proof of Proposition 4.1).
Initially, let $\sigma(v_0)$ be the first element of $\hat\sigma_j$, where $j$ is determined by $v_0 \in N_j^+$. At each step,
$\phi(a_i)$ is defined to be the next element of $B$ in the ordering $\hat\phi$. At each step where $\tau^{-1}(\beta(\phi(a_i)))$
has not yet been determined, choose $v$ to be the next unused member of $N_j^-$ in the ordering
$\hat\tau_j$. Set $\tau(v) = \beta(\phi(a_i))$. Then, if $\sigma(v)$ is not yet determined, define it as the next member of
$N_j^+$ (where $v \in N_j^+$ determines $j$) in the ordering $\hat\sigma_j$.

At any given stage, when $k$ points $\phi(a_i)$ have been chosen so far, let $K \subseteq B$ denote the set
of these points, which must be the first $k$ points of $\hat\phi$. (This corresponds to the set $K_k$ in the
proof of Proposition 4.1; we suppress the indices such as $k$ for simplicity.) Also let $J^+$ denote
the set of values $\sigma(v)$ determined so far (note this is precisely $\{\sigma(v) : v \in S\}$), and somewhat
asymmetrically, define $J^-$ to be the set of vertices whose image under $\tau$ has been determined.
Then $J^- = S$ if $v_0$ was chosen at some stage as $v$, and otherwise $J^- = S \setminus \{v_0\}$. Define the
following random sets referring to a step after which precisely $k \le m'$ arcs have been exposed:

$$J_1^+ = J^+ \cap N_1^+, \qquad J_2^+ = J^+ \cap N_2^+, \qquad J_1^- = J^- \cap N_1^-, \qquad J_2^- = J^- \cap N_2^-,$$

and put $t_1^+ = |J_1^+|$ and so on. Then at each step of the algorithm, conditional upon having
given cardinalities that can feasibly occur, the permutations $\hat\sigma$ etc. determine these sets, and
ensure that each of these sets occurs u.a.r. as subsets of $N_1^+$, $N_2^+$, $N_1^-$ and $N_2^-$ respectively,
and the same property holds for $K$ as a subset of points in $B$ with cardinality $k$. Furthermore,
all these sets occur jointly independently of each other. For $t = (t_1^+, t_2^+, t_1^-, t_2^-)$, let $\Omega(k,t)$
denote the probability space of such independently chosen sets, $K$ and the $J_1^+$ etc., with these
cardinalities. Next define

$$\hat k = \sum_{j \in J^+} |A_j|, \tag{36}$$

$$\hat t_1^+ = |(J^- \cup \{v_0\}) \cap N_1^+|, \tag{37}$$

$$\hat t_2^+ = |(J^- \cup \{v_0\}) \cap N_2^+|, \tag{38}$$

$$\hat t_1^- = |\{i \in N_1^- : K \cap B_i \ne \emptyset\}|, \tag{39}$$

$$\hat t_2^- = |\{i \in N_2^- : K \cap B_i \ne \emptyset\}|. \tag{40}$$

By the form of the algorithm, at each iteration, precisely after the point when a new image
of $\sigma$ is exposed, we have that $t_1^+ + t_2^+ = t_1^- + t_2^-$ and also

$$\hat t_i^+ = t_i^+ \ \text{ and } \ \hat t_i^- = t_i^-, \qquad i = 1, 2. \tag{41}$$

Moreover, in the event $U_{v_0}$ that the algorithm terminates with $S$ being a sink-set, we have

$$\hat k = k \tag{42}$$

and

$$k > t_1^+ + t_2^+ \tag{43}$$

if it is complex.

Thus, setting $F_{k,t}$ to be the event that the tuple $t$ occurs in the algorithm after $k$ arcs are
exposed, and $H$ the event that (41) and (42) hold, we have by the union bound

$$P_{\mathcal H'(a)}(U_{v_0}) \le \sum_{k \le m'/2}\ \sum_{t_1^+ + t_2^+ = t_1^- + t_2^- < k} P_{\mathcal H'(a)}(F_{k,t} \cap H). \tag{44}$$
Note that $H$ is also defined in the space $\Omega(k,t)$.

We now note that, using the earlier observation that motivated defining $\Omega(k,t)$,

$$P_{\mathcal H'(a)}(F_{k,t} \cap H) \le P_{\mathcal H'(a)}(H \mid F_{k,t}) = P_{\Omega(k,t)}(H).$$

Thus it suffices to show

$$\sum_{k \le m'/2}\ \sum_{t_1^+ + t_2^+ = t_1^- + t_2^- < k} P_{\Omega(k,t)}(H) = o(1/n'), \tag{45}$$

as the lemma follows from this using the argument in Proposition 4.1 from (26) onwards.

Conditional on the values of $k$ and $t$, the random variables $\hat k$ etc. depend only on the
random permutations $\hat\phi$ etc., and in particular the distribution of $\hat k$ only depends on $t_1^+$, $t_2^+$
and $d^+$; the distributions of $\hat t_1^-$ and $\hat t_2^-$ only depend on $k$ and $d^-$; the distributions of $\hat t_1^+$ and
$\hat t_2^+$ only depend on $t_1^-$, $t_2^-$ and $d^+$.

Case 1. $\log^4 n' \le k \le m'/2$

Let $g = 1/1000$. Let $E_1$ be the event that $|\hat t_1^-/(k/3) - 1| \le g$ and $\hat t_2^-/(k/2) - 1 \ge -g$.
Let $E_2$ be the event that $|\hat t_1^+/t_2^- - 1| \le g$ and $|\hat t_2^+/t_1^- - 1| \le g$. Let $E_3$ be the event that
$|\hat k/(t_1^+ + 2t_2^+) - 1| \le g$. Given any fixed values for $k$ and $t$ with $k \ge \log^4 n'$, if both (41)
and (42) hold then, clearly, at least one of $E_1$, $E_2$ and $E_3$ must fail for $n'$ sufficiently large.
We claim that each of $\overline{E_1}$, $E_1 \setminus E_2$, and $(E_1 \cap E_2) \setminus E_3$ have probability $o((n')^{-5})$ in the spaces
$\Omega(k,t)$ occurring in (45). Thus $P_{\Omega(k,t)}(H) = o((n')^{-5})$ in all cases, yielding (45) by summing
over $k$ and the constrained $t$.

To verify the claims about the $E_i$, the same type of argument as in Case 1 in the proof of
Proposition 4.1 suffices. For instance, regarding $E_1$, recall that $K$ is a random subset of the
points in $B$ of cardinality $k$. We can instead assume that the points of $B$ are independently
chosen with probability $k/m'$, and condition later on obtaining precisely $k$ points, which holds
with probability $\Theta(1/\sqrt{k})$. Therefore, it is enough to show that $E_1$ has probability $1 - o((n')^{-6})$
in the unconditional probability space where elements of $B$ are chosen independently. Noting
by (35) that $|N_1^-| \sim r$, $|N_2^-| \sim r$ and $m' = \Theta(n')$, we have by (9) that the probability that
the number $\hat t_1^-$ of points chosen in $N_1^-$ satisfies $|\hat t_1^-/(k/3) - 1| > g$ is $o(1/(n')^6)$. Similarly,
from Lemma 2.3 (applied to $|N_2^-|$ copies of $\mathrm{TPo}_2(\lambda)$ with $q = k/m'$), the probability that
$\hat t_2^-/(k/2) - 1 < -g$ is $o(1/(n')^6)$. For $E_2$, note that $\hat t_1^+ = |V \cap N_1^+|$, where $V$ denotes the
set of the first $t_2^-$ elements of $\hat\tau_2$. Since $V$ is a random subset of $N_2^-$, and since by (35)
$|N_2^- \setminus N_1^+| = o(|N_2^-|)$, we have $|\hat t_1^+/t_2^- - 1| \le g$ with probability $1 - o((n')^{-5})$ provided say
$t_2^- \ge \log n'$. This is guaranteed by $E_1$. The other statement in $E_2$ works exactly the same,
and thus the probability of $E_1 \setminus E_2$ is $o((n')^{-5})$. Finally, for $E_3$, conditional on $a$, we just
consider the fixed number $r + a_{2,1} + a_{2,2}$ of balls thrown randomly into the $a_{2,1} + a_{2,2}$ bins
conditional on at least two in each bin (one ball in all other bins), and argue as for (29) to
deduce that when $t_2^+$ bins are selected u.a.r., with high probability they contain approximately
$2t_2^+$ balls.

Case 2. $k < \log^4 n'$

The argument for Case 2 in the proof of Proposition 4.1 applies almost directly to the
current setting, with of course $\mathcal P'(d)$ and $\mathcal P'_{1,1}(n,m)$ replaced by $\mathcal H'(d)$ and $\mathcal H'(a)$. The only
twist is that we have to show that, conditional upon $a$, the indegree sequence has maximum
less than $\log^2 n$ with probability $1 - o(1/n)$. Such a sequence can be generated by putting
$r + n' - a_{2,1}$ elements randomly into $a_{1,2} + a_{2,2}$ bins subject to each bin receiving at least
two balls. By (35) the excess of balls over bins is $o(r)$ and so the required property follows
easily. □

Lemma 5.4. Let $a$ be any feasible tuple satisfying $\Gamma$. Then a random preheart configuration
in $\mathcal Q(a)$ is simple and strongly connected with probability $1/9 + o(1)$.

Proof. A preheart configuration $Q \in \mathcal Q(a)$ is strongly connected iff its underlying heart
configuration $H = H(Q)$ is. Note moreover that $H$ is distributed as in $\mathcal H(a)$, by construction.

Recall the definition of s-cycle from the proof of Proposition 4.2, and note that if $H$ has no
complex s-set of at most $m'/2$ arcs, then strong connectedness of $H$ is equivalent to $H$ having
no s-cycles. Thus, in view of Lemma 5.3, we only need to show that a heart configuration in
$\mathcal H(a)$ has no s-cycles with probability $1/9 + o(1)$, and that when inserting $m - m'$ vertices in
the arcs in order to generate a preheart configuration $Q \in \mathcal Q(a)$ we get a simple digraph a.a.s.

Since $\Gamma$ holds, we have (35). We first claim that this implies that a.a.s. the number $S$ of
pairs of points that lie in the same in-bin is $O(r)$. Let $n_2 := a_{2,1} + a_{2,2}$, which must be $r - o(r)$.
We have a distribution of $r + n_2$ points into $n_2$ in-bins chosen u.a.r. conditional upon each bin
receiving at least two points. If $r - n_2 = o(\log r)$ say, immediately $S \le r + O(\log^2 r) = O(r)$.
If on the other hand $r - n_2 \to \infty$ (but recall it is $o(r)$), then this multinomial distribution can
be approximated by $n_2$ independent 2-truncated Poissons conditional upon having sum $r + n_2$
(see Lemma 2.1). Combining Lemmas 2.4 and 2.2 we deduce that $S = O(r)$ with probability

$$1 - O\big((r - n_2)^{1/2} e^{-\log^3 r}\big) = 1 - o(1).$$

The same holds for out-bins, so we may assume that the number of ways of choosing a set
$\{a_1, a_2\}$ of out-points in the same bin and a set $\{b_1, b_2\}$ of in-points in the same bin is $O(r^2)$.
The probability that $a_1, a_2$ are matched to $b_1, b_2$ thus creating a double arc is $O(1/r^2)$. The
probability that a given double arc in $H$ gets no vertex inserted during the construction of
$Q$ is $(m' - 2)(m' - 1)/\big((m - 2)(m - 1)\big) = O(r^2/n^2) = o(1)$. Combining these conclusions, the
expected number of double arcs in $H$ that get no vertex inserted during the construction of
$Q$ is $o(1)$, and therefore $Q$ is simple a.a.s.

Let

$$\mu_k = \sum_{j=1}^{k} \frac{2}{j}\left(\frac{2}{3}\right)^{j} \quad\text{and}\quad \mu = \lim_{k\to\infty}\mu_k = 2\log\frac{1}{1-2/3} = \log 9.$$

The number of s-cycles of order at most $k$ in $\mathcal{H}(\mathbf{d})$ is asymptotically Poisson of mean $\mu_k$. This follows from estimating the factorial moments of this number of s-cycles in a similar way as in the proof of Proposition 4.2. The present case is simpler in two ways: firstly, there are no sets of vertices which are both a sink-cycle and a source-cycle, since this would imply having isolated cycles consisting of vertices of degree (1,1). Secondly, the fact that the number of bins with degree exactly (1,2) and the number of bins with degree exactly (2,1) are each concentrated around $r$, and that the number of points in bins with higher degrees is negligible, makes calculations much simpler than those in the proof of Proposition 4.2. As before, the probability of having some s-cycles of order greater than $k$ can easily be bounded by some $f_k$ such that $\lim_{k\to\infty} f_k = 0$. Therefore, the probability of having no s-cycles is $e^{-\mu} + o(1)$, as required. □
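
As a numerical sanity check of the limit $\mu = \log 9$ used in this proof, the partial sums $\mu_k$ can be evaluated directly. The following is only an illustrative sketch; the function name `mu_k` is ours and does not appear elsewhere in the paper.

```python
import math


def mu_k(k: int) -> float:
    """Partial sum mu_k = sum_{j=1}^{k} (2/j) * (2/3)**j."""
    return sum((2.0 / j) * (2.0 / 3.0) ** j for j in range(1, k + 1))


limit = math.log(9.0)
for k in (1, 5, 20, 80):
    # The gap to log 9 shrinks geometrically, at rate roughly (2/3)**k.
    print(k, mu_k(k), limit - mu_k(k))

# Limiting probability of having no s-cycles, to be compared with 1/9.
print(math.exp(-mu_k(200)), 1.0 / 9.0)
```

The series is $2\sum_{j\ge 1} x^j/j = -2\log(1-x)$ at $x = 2/3$, which is why $e^{-\mu}$ agrees with the constant $1/9$ appearing in the statement.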

Finally, we proceed to prove Theorem 1.1 for the case $c \to 1$. Denote by $K(n,m)$ the number of strongly connected digraphs with $n$ vertices and $m$ arcs. Given any degree sequence $\mathbf{d} \in \mathcal{D}$, there are exactly $m!\,(m'/m)$ preheart configurations in $\mathcal{Q}(\mathbf{d})$. Thus, in view of Lemma 5.1 and setting $A$ to be the event simple and strongly connected, we can write



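$$K(n,m) = \sum_{\mathbf{d}\in\mathcal{D}} \frac{m!\,(m'/m)\,\mathbf{P}_{\mathcal{Q}(\mathbf{d})}(A)}{\prod_{i=1}^{n} d_i^+!\,d_i^-!} = (m-1)!\,\frac{(e^\lambda-1)^{2n}}{\lambda^{2m}}\,\mathbf{E}_{\hat{\mathcal{D}}}\big(m'\,\mathbf{P}_{\mathcal{Q}(\mathbf{d})}(A)\big)\,\mathbf{P}_{\mathcal{D}}(\Sigma) \sim \frac{(m-1)!\,(e^\lambda-1)^{2n}}{2\pi(m-n)\lambda^{2m}}\,\mathbf{E}_{\hat{\mathcal{D}}}\big(m'\,\mathbf{P}_{\mathcal{Q}(\mathbf{d})}(A)\big), \tag{46}$$

since $\mathbf{P}_{\mathcal{D}}(\Sigma) \sim \frac{1}{2\pi n(1+\lambda-c)} \sim \frac{1}{2\pi(m-n)}$ by Lemma 2.1.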



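To estimate $\mathbf{E}_{\hat{\mathcal{D}}}\big(m'\,\mathbf{P}_{\mathcal{Q}(\mathbf{d})}(A)\big)$ we will restrict ourselves to the event $T$. If $T$ holds, then (35) gives $m' \sim 3n'/2 \sim 3(m-n)$. From Lemmata 5.3 and 5.4, for any $\mathbf{d}$ satisfying $T$, we have

$$\mathbf{P}_{\mathcal{Q}(\mathbf{d})}(A) \sim \frac{1}{9}.$$

Moreover, from Lemma 5.2 we have that $\mathbf{E}_{\hat{\mathcal{D}}}(m'(1 - 1_T)) = o(1)$ and in particular $\mathbf{P}(T) = 1 + o(1)$. Therefore,

$$\mathbf{E}_{\hat{\mathcal{D}}}\big(m'\,\mathbf{P}_{\mathcal{Q}(\mathbf{d})}(A)\big) = (1 + o(1))\,3(m-n)\,\mathbf{P}_{\mathcal{Q}(n,m)}(A \mid T) + o(1) \sim (m-n)/3.$$

Combining this with (46), we obtain (2) and thus complete the proof of the theorem.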



6 Denser case: $c \to \infty$

In this section, we treat the case that $c \to \infty$ with $c = O(\log n)$. For such $c$, it follows easily from (7) that

$$c = \lambda + o(1). \tag{47}$$

Our goal is to obtain the asymptotic number of strongly connected digraphs in this case, and therefore complete the proof of Theorem 1.1. The main result in this section is the following.
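
To illustrate (47) numerically, the sketch below solves $c = \lambda e^{\lambda}/(e^{\lambda}-1)$ for $\lambda$ by bisection and prints the gap $c - \lambda = \lambda/(e^\lambda - 1)$. We assume here that (7) is this relation between $c$ and $\lambda$ (it is the one quoted in Proposition 7.2); the helper names `c_of_lambda` and `solve_lambda` are ours.

```python
import math


def c_of_lambda(lam: float) -> float:
    """lam * e^lam / (e^lam - 1), written as lam / (1 - e^{-lam}) for stability."""
    return lam / -math.expm1(-lam)


def solve_lambda(c: float, tol: float = 1e-12) -> float:
    """Bisection for the positive root of c_of_lambda(lam) = c, valid for c > 1."""
    if c <= 1.0:
        raise ValueError("the equation has a positive root only for c > 1")
    # c_of_lambda is increasing, tends to 1 as lam -> 0, and exceeds lam,
    # so the root lies strictly between 0 and c.
    lo, hi = 0.0, c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if c_of_lambda(mid) < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for c in (1.5, 3.0, 10.0, 20.0):
    lam = solve_lambda(c)
    # The gap c - lam decays like c * e^{-c} as c grows.
    print(c, lam, c - lam)
```

The gap is already below $10^{-3}$ at $c = 10$, consistent with $c - \lambda \to 0$ as $c \to \infty$.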

Proposition 6.1. For $c := m/n \to \infty$ with $c = O(\log n)$, a random digraph in $\mathcal{G}_{1,1}(n,m)$ is a.a.s. strongly connected.

This result, combined with Theorem 1.2, gives the asymptotic number of strongly connected digraphs in the case that $c \to \infty$ with $c = O(\log n)$, which by (47) is asymptotic to (3), and thus the proof of Theorem 1.1 is complete.

Proof. As explained at the start of Section 4, it suffices to show that a.a.s. there are no sink-sets. As before, we let $s$ be the cardinality of a hypothetical sink-set. By duality it suffices to consider only $s \le n/2$.

Let $G \in \mathcal{G}_{1,1}(n,m)$. We consider two cases. Let $K$ be fixed, and chosen sufficiently large as determined by the argument in Case 2 below.

Case 1: $s \le c^K$

Let $N_1$ be the number of vertices of outdegree 1 in $G$. Define $f = f(c) = \lambda/(e^\lambda - 1) + n^{-1/3}$. The probability that one truncated Poisson r.v. equals 1 is $\lambda/(e^\lambda - 1)$. Hence, by (8), in the space $\mathcal{D}$ with $N_1$ interpreted in the natural way, we have

$$\mathbf{P}(N_1 > f(c)n) = e^{-\Omega(\log^3 n)}.$$

Then the same conclusion holds in $\hat{\mathcal{D}}$ by Lemma 3.2(a). This also transfers to the random graph $G \in \mathcal{G}_{1,1}(n,m)$ by Lemma 3.2(b) and Corollary 3.3, which show that probabilities multiply by at most $e^{O(\log^2 n)}$.
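
The point mass $\lambda/(e^\lambda-1)$ and the threshold $f(c)$ can be checked numerically. The sketch below takes $\mathrm{TPo}(\lambda)$ to be a Poisson($\lambda$) variable conditioned on being at least 1; the function names are ours, and the values of $\lambda$ and $n$ are arbitrary illustrations.

```python
import math


def tpo_pmf(i: int, lam: float) -> float:
    """P(TPo(lam) = i) = lam^i / (i! (e^lam - 1)), computed in log space."""
    if i < 1:
        return 0.0
    log_p = i * math.log(lam) - math.lgamma(i + 1) - math.log(math.expm1(lam))
    return math.exp(log_p)


def outdegree_one_threshold(lam: float, n: int) -> float:
    """f = lam / (e^lam - 1) + n^(-1/3), the density threshold for N_1."""
    return lam / math.expm1(lam) + n ** (-1.0 / 3.0)


lam = 4.0  # illustrative value only
total = sum(tpo_pmf(i, lam) for i in range(1, 200))
mean = sum(i * tpo_pmf(i, lam) for i in range(1, 200))
print(total)                        # total mass of the truncated distribution
print(tpo_pmf(1, lam))              # probability of the value 1
print(mean)                         # mean, to be compared with lam e^lam / (e^lam - 1)
print(outdegree_one_threshold(lam, 10**6))
```

The slack $n^{-1/3}$ in $f$ is what leaves room for the concentration bound on $N_1$ around its expectation $n\lambda/(e^\lambda-1)$.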

We will condition on $N_1 = n_1$ where $n_1 \le f(c)n$, and consider the set $\mathcal{N} = \{n_1 : n_1 \le f(c)n,\ \mathbf{P}(N_1 = n_1) \ge 1/n^2\}$. Then $\mathbf{P}(N_1 \notin \mathcal{N}) < 1/n = o(1)$. Let $H$ denote the event of not being strongly connected. We will show that

$$\max_{n_1 \in \mathcal{N}} \mathbf{P}(H \mid N_1 = n_1) = o(1),$$

which implies the result immediately. It helps to consider separately the event $J$ that the maximum in- or outdegree in the digraph is less than $\log^2 n$. By (1) and Lemma 2.2, $\mathbf{P}(J) = 1 - o(1/n^2)$. So, letting $X_s$ denote the number of sink-sets of cardinality $s$, the above equation follows if we show

$$\sum_{1 \le s \le n/2} \mathbf{E}(X_s \wedge 1_J \mid N_1 = n_1) = o(1) \tag{48}$$

for all $n_1 \in \mathcal{N}$.

By symmetry, we can assume the $n_1$ vertices of outdegree 1 are specified in advance, so we may work in the restricted model, $\mathcal{G}_{1,1}(n,m,n_1)$, in which $V(G)$ is partitioned into two sets of vertices, $n_1$ in a set $A$ all of outdegree 1, and the rest in a set $B$ all of higher outdegrees. This is equivalent to $\mathcal{G}_{1,1}(n,m)$ conditioned on the set of vertices of outdegree 1 being precisely $A$. We use $\mathbf{E}^*$ to denote expectation in this probability space.

We will bound the probability $p(s,i,q)$ that $G \in J$ and some given set of vertices $S$ is a sink-set of $G$ (also put $R = V \setminus S$), where $|S \cap A| = i$ and the set $Q$ of arcs with both ends in $S$ satisfies $|Q| = q$. The vertices in $B$ have outdegree at least 2, so $q \ge 2s - i$. By vertex symmetry,

$$\mathbf{E}^*(X_s \wedge 1_J) = \sum_{i=0}^{s} \binom{n_1}{i}\binom{n-n_1}{s-i} \sum_{q \ge 2s-i} p(s,i,q) \le n^s \sum_{i,q} (f(c))^i\, p(s,i,q). \tag{49}$$

We bound $p(s,i,q)$ using a switching technique. Take any digraph $G \in J$ with a sink-set $S$ as above, choose a set $Q'$ of arcs with both ends not in $S$ and with $|Q'| = q$, match up the arcs in $Q$ with those in $Q'$ in any manner, delete all arcs in $Q$ and $Q'$, and for each matched pair $uv \in Q$ and $u'v' \in Q'$, add the arcs $uv'$ and $u'v$. We call this operation a switching. The number of ways it can be performed on $G$ without creating any multiple arcs depends on the maximum degree (in- or out-) of $G$, which we denote by $\Delta$. For each arc $uv \in Q$, there are at most $\Delta$ arcs $wv$ and at most $\Delta^2$ arcs $wx$ excluded from choice as $u'v'$ due to causing double arcs. A similar number of exclusions of the form $wx$ come from arcs $ux$. Also, at least $m - 2s\Delta$ arcs have both ends in $R$. Hence the number of valid switchings is at least $(m - O(\Delta^2 + s\Delta))^q$. Performing such a switching produces some digraph $G'$ with the same (in,out)-degree sequence as $G$. How many such switchings can produce the same digraph $G'$? Assume $G$ has $r$ arcs directed from $R$ to $S$, so $G'$ has $r + q$ such arcs. Choose $q$ of these and pair them up with the $q$ arcs of $G'$ from $S$ to $R$, to reverse the switching. This gives an upper bound (some reverse switchings may be invalid) of say $(r+q)^q$ digraphs $G$ which can produce $G'$.
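
The switching operation can be made concrete on a small example. The sketch below is only an illustration under our own conventions: digraphs are sets of ordered pairs, and the function names and the toy digraph are ours. It applies one switching and checks that the (in,out)-degree sequence is preserved, that the number of arcs from $R$ to $S$ grows from $r$ to $r+q$, and that exactly $q$ arcs now leave $S$.

```python
from collections import Counter


def switch(arcs, pairs):
    """Apply a switching to a simple digraph given as a set of arcs (u, v).

    For each matched pair ((u, v), (u2, v2)) the two arcs are deleted and the
    arcs (u, v2) and (u2, v) are added. Returns the new arc set, or None if
    the operation would create a multiple arc.
    """
    new_arcs = set(arcs)
    for (u, v), (u2, v2) in pairs:
        new_arcs.discard((u, v))
        new_arcs.discard((u2, v2))
    for (u, v), (u2, v2) in pairs:
        for arc in ((u, v2), (u2, v)):
            if arc in new_arcs:
                return None  # this switching is not valid
            new_arcs.add(arc)
    return new_arcs


def degree_sequence(arcs):
    """Return (outdegree counter, indegree counter) of a digraph."""
    return Counter(u for u, _ in arcs), Counter(v for _, v in arcs)


# Toy digraph: S = {0, 1} is a sink-set (no arc leaves S), R = {2, 3, 4}.
S, R = {0, 1}, {2, 3, 4}
arcs = {(0, 1), (1, 0), (2, 3), (3, 4), (4, 2), (2, 0)}
Q = [(0, 1), (1, 0)]        # arcs with both ends in S, so q = 2
Q_prime = [(2, 3), (3, 4)]  # arcs with both ends outside S
new_arcs = switch(arcs, list(zip(Q, Q_prime)))
print(sorted(new_arcs))
```

In the toy digraph $r = 1$ and $q = 2$, so after the switching there are three arcs from $R$ to $S$ and two arcs from $S$ to $R$; reversing the switching amounts to re-pairing those two families of arcs, which is the source of the $(r+q)^q$ bound.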

At this point, we may deduce that the contribution to $p(s,i,q)$ from digraphs $G$ with the given values of $r$ and $\Delta$ is at most

$$\frac{(r+q)^q}{(m - O(\Delta^2 + s\Delta))^q} \le \left(\frac{O(1)(r+q)}{m}\right)^q \tag{50}$$

since $G \in J$ and $\Delta \le \log^2 n$. Note that the contribution to $p(s,i,q)$ from all $r$ such that $r \le c^3s^3$ or $r \le 18q$ is

$$\left(\frac{(c+q)^{O(1)}}{m}\right)^q. \tag{51}$$

To eliminate the influence of unusually large values of $r$ will require a more elaborate argument. If a vertex $v$ of $S$ in $G'$ is adjacent from $k$ vertices in $R$, where $k > 8c$, perform an additional switching to $G'$: choose $k - \lfloor 4c \rfloor$ arcs $u_1v, \dots, u_{k-\lfloor 4c\rfloor}v$ with $u_i \in R$, and replace them by arcs $u_iw_i$ with each $w_i \in R$ (without producing multiple arcs). This produces a digraph $G_1$ having $A$ as its set of vertices of outdegree 1. The number of ways of performing this switching is at least (omitting floor functions for simplicity) $\binom{k}{k-4c}(n - s - \Delta)^{k-4c}$ since each vertex $u_i$ has outdegree at most $\Delta$. Each possible $G_1$ is produced in at most $\binom{m}{k-4c}s$ ways, so the number of $G'$ divided by the number of $G_1$ is at most

$$\frac{\binom{m}{k-4c}\,s}{\binom{k}{k-4c}(n-s-\Delta)^{k-4c}} \le s\left(\frac{ec}{k}\right)^{k-4c}\big(1 + O(s/n + \Delta/n)\big)^{k-4c} = O(s)\left(\frac{2ec}{k}\right)^{k/2},$$

bounding the upper binomial above by $(em/(k-4c))^{k-4c}$ and the lower one below by $(k/(k-4c))^{k-4c}$, and using $8c < k \le \Delta \le \log^2 n$.

If any vertex of $G_1$ in $S$ is still adjacent from more than $8c$ vertices of $R$, we may repeat the previous step, to obtain $G_2$, and so on, up to some graph $G''$. If the switching is applied to $j$ vertices of indegrees $k_1, \dots, k_j$ with $\sum k_i = k$, the factors multiply to give

$$s^j (O(1)c)^{k/2} \prod_i k_i^{-k_i/2} \le s^j (O(1)c)^{k/2} (j/k)^{k/2}$$

by log-convexity of $x^x$. The worst case is $j = s$, which shows that the number of possible $G'$ is at most $s^s\big(O(1)c/(k/s)\big)^{k/2}$ times the number of $G''$, where $k \ge r - 8cs$. Note that $G'' \in \mathcal{G}_{1,1}(n,m,n_1)$.

Thus, the contribution to $p(s,i,q)$ from $G$ with given $r$ is at most

$$\frac{(O(1))^q (r+q)^q}{m^q}\left(\frac{O(1)cs}{r - 8cs}\right)^{(r-8cs)/2} s^s \tag{52}$$

where, by the conclusion using $G'$, the second factor can be taken to be 1 for any particular $r$ (such as $r \le 8cs$). Note that $q \le s\Delta \le c^K \log^2 n$. For $r$ larger than $c^3s^3$ and larger than $18q$, the factor $(O(1)cs/(r-8cs))^{(r-8cs)/2}$ is at most $r^{-r/4}$ (for large enough $n$) and $s^s \le r^{r/18}$, so the product, summed over such $r$, is $(o(1)/m)^q$. In view of this and of (51), (49) gives

$$\mathbf{E}^*(X_s \wedge 1_J) \le n^s \sum_{i,q} (f(c))^i \big((c+q)^{O(1)}/m\big)^q,$$

where $i \le s$ and $q \ge 2s - i = s + s - i$. So this is at most

i,t>0 

which is at most $\big(c^{O(1)} \max\{f(c), 1/m\}\big)^s$. Summing this over $s \le c^K$ gives $o(1)$ for the contribution to (48) from Case 1.

Case 2: $c^K < s \le n/2$

Let $N_{\le 3}$ be the number of vertices of outdegree at most 3 in $G$, and let $h = c^3/(e^c - 1) + c^3/n$. Then $\mathbf{E}N_{\le 3} < nh/2$, and by comparing with a binomial r.v. with expected value $nh/2$, and using (9), we have

$$\mathbf{P}_{\mathcal{D}}(N_{\le 3} > hn) = e^{-\Omega(hn)} = o(m^{-1}e^{-c^2}).$$

So by Lemma 3.2(a,b), we deduce that $N_{\le 3} \le hn$ a.a.s. in $\mathcal{G}_{1,1}(n,m)$, and it suffices to prove that, conditional on this event, a.a.s. there are no sink-sets with cardinality $s$ in the range under consideration.

Henceforth, we consider $\mathcal{P}'_{1,1}(n,m)$ (defined near the start of Section 3) conditional upon a fixed outdegree sequence satisfying $N_{\le 3} = n_{\le 3}$ for some $n_{\le 3} \le hn$. Again by Lemma 3.2(b), it is enough to show that if $R$ is the event that there exists a sink-set of size $s$ satisfying $c^K < s \le n/2$,

$$\mathbf{P}(R) = o(e^{-c^2}). \tag{53}$$

We note that our usual approach to proving properties of the degree sequence would be to work with independent truncated Poisson r.v.s for the degree sequences, prove what we want, and then condition on the sums. However, the last step increases probabilities of bad events in a manner unacceptable for the present argument. To avoid this, we define an auxiliary sequence $\hat d_1^-, \dots, \hat d_n^-$ of independent copies of $\mathrm{Bin}(\hat n, \hat c/n)$, where $\hat n = (1+\delta)n$ and $\hat c = (1+\delta)c$, and we set

$$\delta = \epsilon/8, \qquad \epsilon = 0.1.$$

This sequence will be used to stochastically dominate some random variables defined on the indegree sequence of $\mathcal{P}'_{1,1}(n,m)$.
We next define some events that hold with high probability for the degree sequence. Let $\Delta = \lfloor 5\log n/\log\log n + c^2 \rfloor$ (so in particular $\Delta \le \log^3 n$, which suffices for most of our argument). This $\Delta$ turns out to be a typical bound on the maximum indegree. Let

$$p_i = \mathbf{P}(\mathrm{TPo}(\lambda) = i) = \frac{\lambda^i}{i!\,(e^\lambda - 1)},$$

$$\hat p_j = \mathbf{P}(\mathrm{Bin}(\hat n, \hat c/n) = j) = \binom{\hat n}{j}(\hat c/n)^j(1 - \hat c/n)^{\hat n - j},$$

and set

$$j_0 = \min\{j \ge 1 : np_j \ge \log^{10} n\}, \qquad j_3 = \max\{j : np_j \ge \log^{10} n\}. \tag{54}$$

Define the interval $I = \{j_0, \dots, j_3\}$ and let $I' = \{1, \dots, \Delta\} \setminus I$. Informally speaking, $I$ is the set of common indegrees and $I'$ the set of rare indegrees. Let $V$ and $V'$ be the set of vertices with indegrees in $I$ and $I'$, respectively. Define $H$ to be the event that the following holds: $d^-_{\max} \le \Delta$ (so $V$ and $V'$ partition the set of vertices); there exists a permutation $\sigma$ of $\{1, \dots, n\}$ with the property that $d_i^- \le 1 + \hat d^-_{\sigma(i)}$ for each $i \in V$; and moreover $|V'| = o(\log^{13} n)$. (The '+1' in the inequality $d_i^- \le 1 + \hat d^-_{\sigma(i)}$ is to make it easier for our argument to cope with the fact that the Poisson variable is truncated at 1, whereas the binomial is not.)

We make several claims whose proofs are postponed. The first is the following.

Claim 1: $\mathbf{P}(H) = 1 - o(e^{-c^2})$.

The rest of the proof consists of showing that $\mathbf{E}(X1_H) = O(0.93^s)$. This, together with Claim 1, gives (53) and we are done.

Given a set $S$ of vertices ($|S| = s$), we may generate $\mathcal{P}'_{1,1}(n,m)$ by specifying the random bijection last, which shows that

$$\mathbf{P}(S \text{ is a sink-set}) \le \left(\frac{d^-(S)}{m}\right)^{d^+(S)} \le \left(\frac{d^-(S)}{m}\right)^{4s-3i},$$

where $i$ is the number of vertices in $S$ of outdegree at most 3. Since the outdegree sequence was fixed and $n_{\le 3} \le hn$, the number of sets $S$ of size $s$ with parameter $i$ is

$$\binom{n_{\le 3}}{i}\binom{n - n_{\le 3}}{s-i} \le \binom{s}{i}\frac{(hn)^i n^{s-i}}{s!} \le \left(\frac{2en}{s}\right)^s h^i.$$
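
The counting bound on the number of sets $S$ is a chain of elementary inequalities of the form $\binom{a}{i}\binom{n-a}{s-i} \le \binom{s}{i}(hn)^i n^{s-i}/s! \le (2en/s)^s h^i$ for $a \le hn$, and it can be spot-checked numerically. The sketch below does this for small, arbitrarily chosen parameters (the function names and the values of $n$, $h$, $s$ are ours and serve only as an illustration).

```python
import math


def count_sets(n: int, n3: int, s: int, i: int) -> int:
    """Number of s-sets with exactly i vertices among the n3 low-outdegree ones."""
    return math.comb(n3, i) * math.comb(n - n3, s - i)


def middle_bound(n: int, h: float, s: int, i: int) -> float:
    """binom(s, i) * (h n)^i * n^(s - i) / s!."""
    return math.comb(s, i) * (h * n) ** i * n ** (s - i) / math.factorial(s)


def final_bound(n: int, h: float, s: int, i: int) -> float:
    """(2 e n / s)^s * h^i, using binom(s, i) <= 2^s and s! >= (s / e)^s."""
    return (2 * math.e * n / s) ** s * h ** i


n, h, s = 200, 0.1, 10
n3 = 20  # the extreme case n3 = h * n
for i in range(s + 1):
    print(i, count_sets(n, n3, s, i), middle_bound(n, h, s, i), final_bound(n, h, s, i))
```

The first inequality uses $\binom{a}{i} \le a^i/i!$ and $\binom{n-a}{s-i} \le n^{s-i}/(s-i)!$; the second uses $\binom{s}{i} \le 2^s$ and $s! \ge (s/e)^s$.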

Putting these together, we can bound the expected number $X$ of sink-sets of size $s$ restricted to the event $H$ by distinguishing cases according to the size of $d^-(S)$:

$$\mathbf{E}(X1_H) \le \sum_{i=0}^{s}\left(\frac{2en}{s}\right)^s h^i\left[\left(\frac{(1+\epsilon)cs}{m}\right)^{4s-3i} + \sum_{t > (1+\epsilon)cs}^{\Delta s \wedge m}\left(\frac{t}{m}\right)^{4s-3i}\mathbf{P}\big((d^-(S) = t) \wedge H\big)\right]. \tag{55}$$

We need some care in treating the terms with $t > (1+\epsilon)cs$. Let $t_1 = t(1+\epsilon/2)/(1+\epsilon)$ and $t_2 = \epsilon t/(2(1+\epsilon))$. Note that $t_1 + t_2 = t$ with $t_1 > (1+\epsilon/2)cs$ and $t_2 > (\epsilon/2)cs$.

Let $S_I$ and $S_{I'}$ be the subsets of $S$ with degrees in $I$ and $I'$ respectively.

Claim 2: $\mathbf{P}\big((d^-(S_I) > t_1) \wedge H\big) \le e^{-\Omega(t_1)}$.

Claim 3: $\mathbf{P}\big((d^-(S_{I'}) > t_2) \wedge H\big) = e^{-\Omega(t_2 \log\log n)}$.

From these claims, it immediately follows that $\mathbf{P}\big((d^-(S) = t) \wedge H\big) = e^{-\Omega(t)}$, and with (55) in mind we note that

$$\sum_{t > (1+\epsilon)cs}^{\Delta s \wedge m}\left(\frac{t}{m}\right)^{4s-3i} e^{-\Omega(t)} = o\left(\left(\frac{(1+\epsilon)cs}{m}\right)^{4s-3i}\right),$$

so then from (55) and using $m = cn$,

$$\mathbf{E}(X1_H) \le (1 + o(1))\sum_{i=0}^{s}\left(\frac{2en}{s}\right)^s h^i\left(\frac{(1+\epsilon)s}{n}\right)^{4s-3i}. \tag{56}$$

For $i < s/100$ (recalling $\epsilon = 0.1$ and $s \le n/2$),

< (2e(l.l))^— J < 0.92 s , 

whilst for i > s/100, 

^y h , {^A±^.y- X < (^)Vaoo (hlty < ( 2 , M! v.ooy < . 92 ,. 

Thus, (56) gives $\mathbf{E}(X1_H) = O(s\,0.92^s) = O(0.93^s)$, as desired. It only remains to prove Claims 1-3.

Proof of Claim 1. If $Y \sim \mathrm{TPo}(\lambda)$, then

P(Y > A) = 0(P(Y = A)) = 0((eA/A) A ) = 0(n-V c2 ). 

Thus the statement holds for the first part of $H$ by taking a union bound and using Lemma 3.2(a).

For the second part, consider a sequence $d_1^-, \dots, d_n^-$ of independent truncated Poisson r.v.s, and a sequence $\hat d_1^-, \dots, \hat d_n^-$ of independent binomial r.v.s, as follows:

$$d_i^- \sim \mathrm{TPo}(\lambda), \qquad \hat d_i^- \sim \mathrm{Bin}(\hat n, \hat c/n),$$

where $\hat n = (1+\delta)n$ and $\hat c = (1+\delta)c$ (recalling $\delta = \epsilon/8$).

Recalling the definitions of $p_j$ and $\hat p_j$ above (54), we have that

$$\mathbf{P}(d_i^- = j) = p_j \sim e^{-\lambda}\frac{\lambda^j}{j!} \quad\text{and}\quad \mathbf{P}(\hat d_i^- = j) = \hat p_j \sim e^{-(1+\delta)^2c}\frac{((1+\delta)^2c)^j}{j!}.$$

Let $Y_j$ and $\hat Y_j$ be the numbers of vertices of degree at least $j$ for each of the models, and similarly, $Z_j$ and $\hat Z_j$ the numbers of vertices of degree at most $j$. We have

$$\mathbf{E}Y_j = n\mathbf{P}(d_i^- \ge j), \quad \mathbf{E}\hat Y_j = n\mathbf{P}(\hat d_i^- \ge j), \quad \mathbf{E}Z_j = n\mathbf{P}(d_i^- \le j), \quad \mathbf{E}\hat Z_j = n\mathbf{P}(\hat d_i^- \le j).$$

Recall the definition of $j_0$ and $j_3$ in (54), and let $j_1 = c - \sqrt{c}/100$ and $j_2 = (1 + 3\delta/2)c$. It is straightforward to check that $1 \le j_0 < j_1 < j_2 < j_3 < \Delta$. If $j_2 \le j \le j_3$, we easily verify that $\mathbf{P}(d_i^- \ge j) = \Theta(p_j)$, and also that $p_j = o(\hat p_j)$ (by considering the ratios $p_{j+1}/p_j$ and $p_j/\hat p_j$). Hence, we have that $\mathbf{P}(d_i^- \ge j) = o(\mathbf{P}(\hat d_i^- \ge j))$. If $j_1 \le j \le j_2$, we have that $\mathbf{P}(d_i^- \ge j) \le 3/4$ (since $\mathrm{TPo}(\lambda)$ is asymptotically normal with mean $\lambda$ and variance $\lambda$, and truncation has a negligible effect on this), and clearly $\mathbf{P}(\hat d_i^- \ge j) \sim 1$ (for similar reasons or using the second moment method). Therefore for $j_1 \le j \le j_3$ we have that $\mathbf{E}Y_j \le (4/5)\mathbf{E}\hat Y_j$. Moreover, note that $Y_j$ and $\hat Y_j$ are binomially distributed, and for $j$ in this range we have that $\mathbf{E}Y_j \ge np_j \ge \log^{10} n$. Hence, by (9) and taking a union bound, the probability that $|Y_j/\mathbf{E}Y_j - 1| > 1/10$ or $|\hat Y_j/\mathbf{E}\hat Y_j - 1| > 1/10$ for some $j$ in the range $j_1 \le j \le j_3$ is $e^{-\Omega(\log^{10} n)}$. In particular, this implies that $Y_j \le \hat Y_j$ for all $j \in [j_1, j_3]$ with probability $1 - e^{-\Omega(\log^{10} n)}$.

On the other hand, if $j_0 \le j \le j_1$, we easily verify that $\mathbf{P}(\hat d_i^- \le j) = \Theta(\hat p_j)$, and also that $\hat p_j = o(p_j)$ (considering the ratios $\hat p_{j-1}/\hat p_j$ and $\hat p_j/p_j$). Therefore, $\mathbf{E}\hat Z_j = o(\mathbf{E}Z_j)$. Similarly as before, $Z_j$ and $\hat Z_j$ are binomially distributed and $\mathbf{E}Z_j \ge np_j \ge \log^{10} n$. Using (9) again, we conclude that $\hat Z_j \le Z_j$ for all $j \in [j_0, j_1]$ with probability $1 - e^{-\Omega(\log^{10} n)}$. Here we distinguish the two cases $\mathbf{E}\hat Z_j \ge (\log^{10} n)/2$ and $\mathbf{E}\hat Z_j < (\log^{10} n)/2$, and for the second case use stochastic domination of $\hat Z_j$ by a binomial r.v. of expectation $(\log^{10} n)/2$.

Summarising, we have that

$$Y_j \le \hat Y_j, \qquad \forall j \in [j_0 + 1, j_3] \tag{57}$$

with probability $1 - e^{-\Omega(\log^{10} n)}$, where we used that $Z_j + Y_{j+1} = \hat Z_j + \hat Y_{j+1} = n$. However, what we really want is a suitable modification of (57) that holds for the range $j \in [j_0, j_3]$ and incorporates the '+1' shift in the definition of $H$. To do this, we distinguish two cases. If $j_0 > 1$, then it is straightforward to verify that $p_{j_0-1}n \ge \log^8 n$, so the same argument as before but changing $j_0$ to $j_0 - 1$ shows that $Y_j \le \hat Y_j \le \hat Y_{j-1}$ for all $j \in [j_0, j_3]$ with probability $1 - e^{-\Omega(\log^8 n)}$. Otherwise if $j_0 = 1$, we trivially deduce from (57) that $Y_j \le \hat Y_{j-1}$ for all $j \in [j_0, j_3]$ with probability $1 - e^{-\Omega(\log^{10} n)}$. Putting everything together, we conclude that

$$Y_j \le \hat Y_{j-1}, \qquad \forall j \in [j_0, j_3]$$

with probability $1 - e^{-\Omega(\log^8 n)}$. Thus, if this last inequality holds, then we can rearrange $\{1, \dots, n\}$ by some permutation $\sigma$ in such a way that $d_i^- \le 1 + \hat d^-_{\sigma(i)}$ for all $i \in V$.

Conditioning on the truncated Poisson r.v.s of the sequence $d_1^-, \dots, d_n^-$ having fixed sum $m$ only multiplies the probability of failure by $O(\sqrt{m}) = o(n)$. The claim follows immediately.

Proof of Claim 2. Since we are restricting the probability space to the event $H$ and the choice of $S$ is uniformly randomised, we can bound the probability in question by replacing $\mathbf{P}\big((d^-(S_I) > t_1) \wedge H\big)$ by $\mathbf{P}(\hat d^-(S) + s > t_1)$, where $\hat d^-(S) = \sum_{i \in S} \hat d_i^-$ (informally speaking, $\hat d^-(S) + s$ is the total indegree of $S$ after having replaced the original indegree sequence by $\hat d_1^- + 1, \dots, \hat d_n^- + 1$). In this model $\hat d^-(S) \sim \mathrm{Bin}(s\hat n, \hat c/n)$, and it is immediate to verify (using standard deviation bounds on binomials; see also (9)) that $\mathbf{P}(\hat d^-(S) > t_1 - s) \le e^{-\Omega(t_1)}$, since $t_1/\mathbf{E}\hat d^-(S) \ge (1+\epsilon/2)/(1+\delta)^2 > 1$ and $s = o(t_1)$.
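
The kind of binomial deviation bound invoked in this proof can be illustrated with the standard multiplicative Chernoff bound $\mathbf{P}(X \ge (1+\eta)\mu) \le \exp\big(-\mu((1+\eta)\log(1+\eta) - \eta)\big)$ for $X \sim \mathrm{Bin}(N,p)$ and $\mu = Np$. We do not claim this is the exact form of (9); the sketch below, with function names and parameters of our own choosing, simply compares the exact upper tail with that bound.

```python
import math


def binom_upper_tail(n: int, p: float, t: int) -> float:
    """Exact P(Bin(n, p) >= t), summing the pmf in log space to avoid overflow."""
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for j in range(max(t, 0), n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
                    + j * log_p + (n - j) * log_q)
        total += math.exp(log_term)
    return total


def chernoff_upper(n: int, p: float, t: int) -> float:
    """Chernoff bound for P(Bin(n, p) >= t) when t exceeds the mean n * p."""
    mu = n * p
    eta = t / mu - 1.0
    if eta <= 0.0:
        return 1.0  # the bound is trivial at or below the mean
    return math.exp(-mu * ((1.0 + eta) * math.log(1.0 + eta) - eta))


n_trials, p = 500, 0.02  # mean 10; illustrative values only
for t in (12, 15, 20, 30):
    print(t, binom_upper_tail(n_trials, p, t), chernoff_upper(n_trials, p, t))

# Relative deviation available in Claim 2 with epsilon = 0.1 and delta = epsilon / 8.
epsilon = 0.1
delta = epsilon / 8
eta_claim = (1 + epsilon / 2) / (1 + delta) ** 2 - 1
print(eta_claim)
```

Since the relative deviation in Claim 2 is bounded below by a positive constant, the exponent in the Chernoff bound is linear in the mean, which is what gives a bound of the form $e^{-\Omega(t_1)}$.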
Proof of Claim 3. Recall that $V'$ is the set of vertices with degrees in $I'$, and that $H$ implies that $|V'| < \log^{13} n$. So the contribution of these to $d^-(S)$ is at most $\log^{16} n$. Thus, if $s \ge \log^{16} n$, we have $\mathbf{P}(d^-(S_{I'}) > t_2) = 0$, since $t_2 \ge \epsilon cs/2 > \log^{16} n$.

So we may assume that $s < \log^{16} n$. In this case $c < \log^{16/K} n \le \log^{0.16} n$ (if say $K > 100$). Thus, $c^2$ is negligible in the definition of $\Delta$ and we have $\Delta \sim 5\log n/\log\log n$. (Here we must be precise, since the trivial bound $\Delta \le \log^3 n$ is not enough for this part of the argument.) Let us fix any $r \le \log^{13} n$ and restrict to the event that $|V'| = r$. Then we may use a model in which elements of $S$ are chosen independently, each with probability $s/n$, and condition on the size $s$ being achieved. Before conditioning, the number $Z = |V' \cap S|$ of red vertices in $S$ is $Z \sim \mathrm{Bin}(|V'|, s/n)$, and conditioning on the size $s$ multiplies any probabilities by $O(s^{1/2})$. Note that $\mathbf{E}Z = o(1)$ and thus, by elementary consideration of the binomial distribution, $\mathbf{P}(Z \ge j) = O(\mathbf{P}(Z = j))$. Hence

P(tT (V' nS)>t 2 Aff)< P(Z > t 2 /A) = 0(P(Z = t 2 /A)) = 
7 Loop-free case

This section treats the case that digraphs are not permitted to have loops. We prove Theorems 1.4 and 1.5, which are analogues of Theorems 1.1 and 1.3. To prove these theorems, we need the following result, which is similar to Lemmas 2.4 and 2.3.

Lemma 7.1. Let $Y_1^+, \dots, Y_N^+, Y_1^-, \dots, Y_N^-$ be independent r.v.s with $\mathrm{TPo}_k(\lambda)$ distribution, for fixed $k$ and for $0 < \lambda < \log N$. Let $c = \mathbf{E}Y_1^+$. Then for any $t \ge \sqrt{N}\log^3 N$ we have

$$\mathbf{P}\left(\left|\sum_{i=1}^{N} Y_i^+Y_i^- - c^2N\right| \ge t\right) = O\left(e^{-(t^2/8N)^{1/5}}\right)$$

asymptotically as $N \to \infty$.

Proof. The argument is almost identical to that of Lemma 2.3, so we just state the main differences. Here, we redefine $\Delta = (t^2/8N)^{1/5}$, $Y_{\max} = \max_{1 \le i \le N}\{Y_i^+, Y_i^-\}$, $W_i = Y_i^+Y_i^- - c^2$ and $W_i^* = W_i 1_{E_i}$, where $E_i$ is the event that $Y_i^+ \le \Delta$ and $Y_i^- \le \Delta$. Note that $\Delta = \Omega(\log^{6/5} N)$, and that $\lambda \le c \le (1 + o(1))\log N$. It only remains to find appropriate bounds on $\mathbf{P}(Y_{\max} > \Delta)$, $|\mathbf{E}W_i^*|$ and $|W_i^* - \mathbf{E}W_i^*|$, and then apply the same steps as in the proof of Lemma 2.3. The bound $\mathbf{P}(Y_{\max} > \Delta) = O(e^{-\Delta})$ is obtained analogously. In view of (1) and the fact that $\mathbf{E}\big((c - Y_i^-)1_{Y_i^- > \Delta}\big) \le 0$, we easily deduce

$$|\mathbf{E}W_i^*| = \mathbf{E}\big((c + Y_i^+)1_{Y_i^+ \le \Delta}\big)\,\mathbf{E}\big((Y_i^- - c)1_{Y_i^- > \Delta}\big) \le 2c\,\mathbf{E}\big(Y_i^- 1_{Y_i^- > \Delta}\big) = O(e^{-\Delta}).$$

Finally, we have $k^2 - c^2 \le W_i^* \le \Delta^2 - c^2$, and therefore $|W_i^* - \mathbf{E}W_i^*| \le \Delta^2$. □

Proof of Theorem 1.5. After extending Lemma 3.2 to the loop-free case, the proof is identical to that of Theorem 1.3. So we just describe this extension of Lemma 3.2, which requires inserting an $e^{-c}$ factor in the asymptotic expressions in parts (b) and (c). The main adjustment in the proof is to redefine $F = \exp(-D_0 - D^+D^-/2)$, where $D_0 = \frac{1}{m}\sum_{i=1}^{n} d_i^+d_i^-$. Instead of Lemma 3.1, we use a version which excludes loops. Again, we can use [BJ Theorem 4.6] with loop-free digraphs interpreted as bipartite graphs with a specific perfect matching being forbidden. Under the same conditions as Lemma 3.1, this implies that the probability that a random element of $\mathcal{P}(\mathbf{d})$ has no loops and no multiple arcs is

«p \-hpi^ - i t W - DW - 1) + o (£) ) , 

uniformly for all $\mathbf{d}$. $B_2$ is the event that $|D_0 - c| > t$ or $|D^+D^-/2 - \lambda^2/2| > t$. We bound the probability that $|D_0 - c| > t$ using Lemma 7.1. □

Proof of Theorem 1.4. Again, we only need to point out how to change the proof of Theorem 1.1. For the case that $c$ is bounded and bounded away from 1, we simply extend Propositions 4.1 and 4.2 in Section 4 to the loop-free case, and combine them with Theorem 1.5. Proposition 4.1 implies its own extension in this new setting, since the probability of an s-set when conditioning on loop-free digraphs can only increase by the inverse of the probability of having no loops, which is $\Theta(1)$ by comparing Theorems 1.5 and 1.3. Proposition 4.2 is extended as follows.

Proposition 7.2. Suppose that $c = m/n$ is bounded and bounded away from 1. The probability that a digraph in $\mathcal{G}_{1,1}(n,m)$ has no plain s-set is asymptotic to

$$e^{c(2/e^{\lambda} - 1/e^{2\lambda})}\,\frac{e^{\lambda}(e^{\lambda} - 1 - \lambda)^2}{(e^{2\lambda} - e^{\lambda} - \lambda)(e^{\lambda} - 1)},$$

with $\lambda$ determined by the equation $c = \lambda e^{\lambda}/(e^{\lambda} - 1)$.

Proof. The argument is almost identical to that of Proposition 4.2. We sketch the main differences. $C_k$ is again the number of s-cycles of order at most $k$ but we exclude s-cycles of order 1 since they will be regarded as loops. $D$ is redefined to be the number of loops and double arcs. We have

$$\mu_k = \sum_{j=2}^{k} \frac{2(c/e^{\lambda})^j - (c/e^{2\lambda})^j}{j}$$

and

$$\mu = \lim_{k\to\infty}\mu_k = \log\frac{(e^{2\lambda} - e^{\lambda} - \lambda)(e^{\lambda} - 1)}{e^{\lambda}(e^{\lambda} - 1 - \lambda)^2} - c(2/e^{\lambda} - 1/e^{2\lambda}).$$

The rest of the argument consists in bounding the probability of having s-cycles of order greater than $k$ for large $k$ and showing that $C_k$ and $D$ are asymptotically jointly independent Poisson with expectations $\mathbf{E}_{\mathcal{P}_{1,1}(n,m)}C_k \sim \mu_k$ and $\mathbf{E}_{\mathcal{P}_{1,1}(n,m)}D \sim c + \lambda^2/2$. □
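
The closed form for $\mu$ can be checked against the partial sums $\mu_k$ numerically. The sketch below uses $c = \lambda e^{\lambda}/(e^{\lambda}-1)$ as in Proposition 7.2; the function names and the sample values of $\lambda$ are ours.

```python
import math


def mu_partial(lam: float, k: int) -> float:
    """mu_k = sum_{j=2}^{k} (2 (c/e^lam)^j - (c/e^{2 lam})^j) / j."""
    c = lam * math.exp(lam) / math.expm1(lam)
    x = c / math.exp(lam)        # equals lam / (e^lam - 1), which is below 1
    y = c / math.exp(2.0 * lam)
    return sum((2.0 * x ** j - y ** j) / j for j in range(2, k + 1))


def mu_closed(lam: float) -> float:
    """Closed form of the limit of mu_k as k tends to infinity."""
    e1, e2 = math.exp(lam), math.exp(2.0 * lam)
    c = lam * e1 / (e1 - 1.0)
    log_part = math.log((e2 - e1 - lam) * (e1 - 1.0) / (e1 * (e1 - 1.0 - lam) ** 2))
    return log_part - c * (2.0 / e1 - 1.0 / e2)


for lam in (0.5, 1.0, 3.0):
    # exp(-mu) is the limiting probability of having no s-cycles of order >= 2.
    print(lam, mu_partial(lam, 500), mu_closed(lam), math.exp(-mu_closed(lam)))
```

The agreement follows from $\sum_{j \ge 1} x^j/j = -\log(1-x)$ applied to $x = c/e^{\lambda}$ and $x = c/e^{2\lambda}$, with the $j = 1$ term $c(2/e^{\lambda} - 1/e^{2\lambda})$ subtracted because s-cycles of order 1 are treated as loops.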
The formula (2), for the very sparse case ($c \to 1$) of Theorem 1.1, remains unchanged: in the proof of Lemma 5.4 one can easily see that the expected number of loops that get no vertices inserted while creating the preheart from the heart is $o(1)$, using an approach similar to the one for double arcs.

Finally, for the denser case ($c \to \infty$ with $c = O(\log n)$) it suffices to verify that Proposition 6.1 is still valid if loops are not permitted. Actually, the argument in Section 6 works for this setting with only the following trivial modifications. Note that for the first case in the proof ($s \le c^K$), the initial switchings do not create or destroy loops. The additional switchings can be performed in at least $\binom{k}{k-4c}(n - s - \Delta - 1)^{k-4c}$ ways without creating loops (which only requires replacing $\Delta$ by $\Delta + 1$) and the resulting bounds obtained in Section 6 are unaffected. The argument for the second case of the proof ($c^K < s \le n/2$) remains valid with the only difference that we have to additionally condition on having no loops. The extra effect of forbidding loops gives an additional asymptotic $e^{-c}$ factor to the probability in Lemma 3.2(b) (see the proof of Theorem 1.5 for the extension of Lemma 3.2 to loop-free digraphs). Since $e^{-c^2} = o(e^{-c-\lambda^2/2})$, showing (53) still suffices in the loop-free context. □